HIERARCHICAL MODELS IN STATISTICAL INVERSE PROBLEMS
AND THE MUMFORD-SHAH FUNCTIONAL*



TAPIO HELIN† AND MATTI LASSAS‡

Abstract. Bayesian methods for linear inverse problems are studied using hierarchical
Gaussian models. The problems are considered with different discretizations, and we analyze
the phenomena which appear when the discretization becomes finer. A hierarchical solution
method for signal restoration problems is introduced and studied with arbitrarily fine
discretization. We show that the maximum a posteriori estimate converges to a minimizer of
the Mumford-Shah functional, up to a subsequence. A new result regarding the existence of a
minimizer of the Mumford-Shah functional is proved.

Moreover, we study the inverse problem under different assumptions on the asymptotic behavior 
of the noise as discretization becomes finer. We show that the maximum a posteriori and conditional 
mean estimates converge under different conditions.

1. Introduction. We study hierarchical Bayesian methods for linear inverse
problems. In particular, we consider inverse problems with different discretizations
and the phenomena which appear when the discretization is refined. The effect of fine
discretization has recently been studied for Gaussian inverse problems in [39], [40], [47],
and motivated by this development we consider hierarchical Gaussian models. More
precisely, we introduce a hierarchical solution method and analyze its properties with
arbitrarily fine discretization.

The inverse problem we consider is the linear signal restoration problem where
the measurement $m(t)$ relates indirectly to the unknown signal $u(t)$ via

(1.1)    $m(t) = Au(t) + e(t), \quad t \in \mathbb{T}.$

Here, $\mathbb{T}$ is the unit circle, which we frequently consider as the interval $[0, 1]$
with the end points identified. Furthermore, $A$ is a smoothing linear integral operator, and
$e(t)$ is random noise. The signals are considered on the unit circle $\mathbb{T}$ to avoid the
complicated boundary effects that fall outside the scope of this paper.

In the Bayesian approach $u(t)$ and $e(t)$ are modelled as random functions. Let
us denote by $U(t, \omega)$ and $\mathcal{E}(t, \omega)$ random functions where $\omega \in \Omega$
is an element of a complete probability space $(\Omega, \Sigma, \mathbb{P})$ and $t \in \mathbb{T}$.
The distributions of $U(t, \omega)$ and $\mathcal{E}(t, \omega)$ model our a priori knowledge of the
unknown signal $u(t)$ and the error $e(t)$, respectively, before the measurement is obtained.
Below, the variable $\omega$ is often omitted. The ideal measurement is considered to be a
realization of the random function $M(t) = AU(t) + \mathcal{E}(t)$ on $t \in \mathbb{T}$. In
Bayesian inversion the aim is to make statistical inference on $U$ given a realization $m$ of
the random function $M$, and the Bayesian solution to the inverse problem means finding the
conditional probability distribution of $U$, called the posterior distribution, or some
estimates for this distribution. Typically studied point estimates are the expectation of the
posterior distribution, called the conditional mean (CM) estimate, and the maximum point of
the posterior density, called the maximum a posteriori (MAP) estimate.

In Bayesian inversion a reconstruction method is said to be edge-preserving if the
functions $u$ which have high probability with respect to the posterior distribution are,
roughly speaking, piecewise smooth and have rapidly changing values only in a set
of small measure. In the finite-dimensional Bayesian inversion theory a number of
methods have been introduced for obtaining edge-preserving reconstructions [T^l [531
[29l [51] . In this paper the prior distribution of the random function $U$ is Gaussian
with a covariance that depends on an auxiliary random function $V$.
Moreover, the random function $V$ has a Gaussian distribution. Such a model is
called a hierarchical Gaussian model. With a fixed discretization similar models have
been studied in inverse problems in [ST]. Furthermore, in the work by Calvetti and
Somersalo [171 HI] hierarchical methods have been used for image processing problems
to obtain edge-preserving and numerically efficient reconstruction algorithms. We
also mention that edge-preserving reconstruction methods have been extensively
studied in the deterministic problem setting, see e.g., [HI EH HSl |49l|56l|58l [59] . Our
main result in this paper connects computing the MAP estimate of a hierarchical
Gaussian model to the minimization of the Mumford-Shah functional [46] used in
image processing. As a byproduct we also present new results concerning the existence
of a minimizer of the Mumford-Shah functional.

Let us next discuss the discretization of Bayesian inverse problems. Above we have 
considered U{t) and M(t) as random functions defined on the unit circle. For any 
practical computations such models have to be discretized, i.e., to be approximated 
by random variables taking values in a finite dimensional space. Roughly speaking, 
a Bayesian model is said to be discretization invariant, if for fixed model parameters 
it works coherently at any level of discretization. For an extended discussion on the 
discretization invariance and the relation of the practical measurement models and 
the computational models considered below, see [32].

In the ideal model the noise $\mathcal{E}$ can be considered as a background noise. In this
paper we will further assume that the practical measurement setting produces an
additional instrumentation noise. More precisely, we assume that the practical
measurement can be modelled as a realization of a random variable $M_k = P_k M + E_k$,
$k = 1, 2, 3, \ldots$, where the operator $P_k$ is a finite-dimensional projection. The random
variable $E_k$ describes the instrumentation noise and it takes values in the range of $P_k$.
Increasing the number $k$ corresponds here to the case when we make more or finer
observations of the ideal measurement signal $M(t)$. Moreover, in practical computations
also $U$ needs to be approximated by a finite-dimensional random variable $U_n$,
which leads us to consider the computational model

(1.2)    $M_{kn} = A_k U_n + \mathcal{E}_k,$

where $k, n \in \mathbb{N}$ are parameters related to discretizing the measurement and the
unknown, respectively. In equation (1.2) we have $A_k = P_k A$, and $\mathcal{E}_k$ is a random
variable in the range of $P_k$ satisfying

(1.3)    $\mathcal{E}_k = P_k \mathcal{E} + E_k.$

In developing new Bayesian algorithms, it is important to study whether the posterior
distribution given by problem (1.2), or some preferred estimate, converges when $k$ or
$n$ increases. This question is often non-trivial. For example, for the total variation
prior it is proven in [IT] that the MAP and CM estimates converge under different
conditions as discretization is refined. Moreover, if the free parameters of the discrete
total variation priors are chosen so that the posterior distribution converges, then the
limit is a Gaussian distribution. Hence, the key property of the total variation prior
is lost in very fine discretizations. This example illustrates the difficulty involved in
discretizing non-Gaussian distributions. Also in this paper we will observe that the
convergence of the MAP and CM estimates occurs in different cases.

Let us next formally define the discrete models we study. Set $N = 2^n$ and let the
points $t_j = j/N$, $j = 0, 1, \ldots, N$, with $t_0$ identified with $t_N$, denote an equispaced
mesh on $\mathbb{T}$. We define $PL(n)$ to be the space of continuous functions
$f \in C(\mathbb{T})$ such that $f$ is linear on each interval $[t_j, t_{j+1}]$ for $0 \le j < N$.
Furthermore, let $PC(n)$ be the space of functions $f \in L^2(\mathbb{T})$ such that $f$ is
constant on each interval $[t_j, t_{j+1})$ for $0 \le j < N$. Denote by
$Q_n : L^2(\mathbb{T}) \to PC(n)$ the orthogonal projections with respect to the
$L^2(\mathbb{T})$ inner product and let $Q = Q_0$ be the projection to constant functions.
Define the operator $D_q = D + \epsilon^q Q : H^1(\mathbb{T}) \to L^2(\mathbb{T})$ where
$\epsilon > 0$, $q > 1$ and $D$ is the derivative with respect to $t \in \mathbb{T}$.

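The hierarchical structure is defined in two steps. First, let $V_{n,\epsilon}$ be a Gaussian
random variable in $PL(n)$ with density function

(1.4)    $\Pi_{V_{n,\epsilon}}(v) = c \exp\left( -\frac{N^\alpha}{2} \int_{\mathbb{T}} \left( \epsilon |Dv|^2 + \frac{1}{4\epsilon} |1 - v|^2 \right) dt \right),$

where $v \in PL(n)$, $\alpha \in \mathbb{R}$ and $N$ is the number of mesh points. Here and below
$c$ is a generic constant whose value may vary. Then choose $v_{n,\epsilon}$ to be a sample of
$V_{n,\epsilon}$. The random variable $U_{n,\epsilon}$, conditioned on $v_{n,\epsilon}$, is then
defined as a Gaussian random variable on $PL(n)$ with density function

(1.5)    $\Pi_{U_{n,\epsilon} \mid V_{n,\epsilon}}(u \mid v_{n,\epsilon}) = c' \exp\left( -\frac{N^\alpha}{2} \int_{\mathbb{T}} \left( \epsilon^2 + |Q_n v_{n,\epsilon}|^2 \right) |D_q u|^2 \, dt \right),$

where $u \in PL(n)$. Note that the constant $c'$ depends on $v_{n,\epsilon}$. Since the density
function presentation in finite-dimensional Hilbert spaces is non-standard, we give
in Section 2 a definition of the random variables $U_{n,\epsilon}$ and $V_{n,\epsilon}$ based on the
coordinate representation.

Roughly speaking, a sample of $V_{n,\epsilon}$ has a high probability if it varies from 1 only
a little, and this variation becomes less smooth if $\epsilon$ is decreased. A sample of
$U_{n,\epsilon}$ has a high probability if it varies rapidly only near the points where
$V_{n,\epsilon}$ is close to zero. Hence the role of the parameter $\epsilon > 0$ is to control how
sharp jumps $U_{n,\epsilon}$ can have and consequently, we call it the sharpness of the prior.
Furthermore, the parameter $\alpha$ describes the scaling of the prior information. The bigger
the value of $\alpha$, the more concentrated the prior distribution.

As a consequence of the construction above, the probability density of the joint
distribution of $(U_{n,\epsilon}, V_{n,\epsilon})$ has the form

    $\Pi_{(U_{n,\epsilon}, V_{n,\epsilon})}(u, v) = c \exp\left( -\frac{N^\alpha}{2} F^\alpha_{\epsilon,n}(u, v) \right),$

where $(u, v) \in PL(n) \times PL(n)$ and

(1.6)    $F^\alpha_{\epsilon,n}(u, v) = \int_{\mathbb{T}} \left( -N^{1-\alpha} \log(\epsilon^2 + |Q_n v|^2) + (\epsilon^2 + |Q_n v|^2) |D_q u|^2 + \epsilon |Dv|^2 + \frac{1}{4\epsilon} |1 - v|^2 \right) dt.$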

The logarithmic term in (1.6) appears due to the fact that the normalization constant
$c'$ in (1.5) depends on $v_{n,\epsilon}$. It turns out that the functional $F^\alpha_{\epsilon,n}$ is
closely connected to the widely studied segmentation method in deterministic image and
signal processing, namely, the Mumford-Shah functional

(1.7)    $F(u, K) = \int_{\mathbb{T} \setminus K} |Du|^2 \, dt + \#(K) + \int_{\mathbb{T}} |Au - m|^2 \, dt$

with respect to the function $u$ and the set $K$ of the points where $u$ jumps [46]. The
notation $\#(K)$ stands for the number of points in $K$. This functional is known to
be difficult to handle numerically and a number of approximations to the variational
problem of minimizing (1.7) have been introduced. In [6, 7] it is shown that the
Mumford-Shah functional can be approximated by elliptic functionals in the sense
of Γ-convergence. These Ambrosio-Tortorelli functionals are the key element in our
presentation.

Let us describe our main results. We study the behaviour of the MAP estimate
in the case when the discretization parameters $k$ and $n$ are coupled. For the sake
of presentation we assume $k = n$ and drop $k$ from the notations. Furthermore, we
assume that $\mathcal{E}_n$ is white noise with variance depending on $n$ and a scaling parameter
$\kappa$. More precisely, $\mathcal{E}_n$ is a Gaussian random function on $\mathbb{T}$ taking values
in $\mathrm{Ran}(P_n)$ with zero expectation and covariance

(1.8)    $\mathbb{E}\left( \langle \mathcal{E}_n, \phi \rangle_{L^2} \langle \mathcal{E}_n, \psi \rangle_{L^2} \right) = N^{-\kappa} \langle \phi, \psi \rangle_{L^2}$

for any $\phi, \psi \in \mathrm{Ran}(P_n)$. Notice that consequently
$\mathbb{E}\|\mathcal{E}_n\|^2_{L^2} = N^{1-\kappa}$ and the choice of $\kappa$ describes how the norm of
the noise is expected to behave asymptotically. We emphasize that the case $\kappa > 1$
corresponds to the assumption that more measurements produce a better expected accuracy,
whereas with $\kappa = 1$ one assumes that the accuracy in the norm of $L^2(\mathbb{T})$ is
expected to stay stable. An example of the case $\kappa > 1$ is when the background noise
$\mathcal{E}$ is negligible and the instrumentation noise follows the asymptotics (1.8). The case
$\kappa = 0$ corresponds to the discretization of the Gaussian white noise, see [12]. To be
able to prove positive results for the convergence of the MAP estimates we will assume

(1.9)    $\kappa = \alpha.$

This implies that the scaling parameter of the prior distribution is determined by the
variance of the noise in discretized measurements. The case when (1.9) is not valid is
discussed in Remark [TJ Due to the equality (1.9) we drop the notation $\kappa$ and use
$\alpha$ as the scaling parameter of the noise distribution.
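
The identity for the expected squared norm of the noise that follows (1.8) can be checked in one line. The computation below is only a sketch: it assumes, as an illustration that is not stated explicitly above, that $\mathrm{Ran}(P_n)$ has dimension $N$, and the $L^2$-orthonormal basis $\{\phi_j\}_{j=1}^N$ of $\mathrm{Ran}(P_n)$ is a hypothetical auxiliary object introduced for this purpose.

```latex
\mathbb{E}\,\|\mathcal{E}_n\|_{L^2}^2
  = \sum_{j=1}^{N} \mathbb{E}\,\langle \mathcal{E}_n, \phi_j \rangle_{L^2}^2
  = \sum_{j=1}^{N} N^{-\kappa}\,\langle \phi_j, \phi_j \rangle_{L^2}
  = N \cdot N^{-\kappa}
  = N^{1-\kappa}.
```

Under this assumption the expected squared norm grows with $N$ when $\kappa < 1$, stays constant when $\kappa = 1$, and decays when $\kappa > 1$, which matches the three regimes described above.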

Under these assumptions the MAP estimate for $(U_{n,\epsilon}, V_{n,\epsilon})$ corresponding to
the measurement $m_n$, $\lim_{n\to\infty} m_n = m$ in $L^2(\mathbb{T})$, is a minimizer

(1.10)    $(u^{MAP}_{n,\epsilon}, v^{MAP}_{n,\epsilon}) \in \operatorname{argmin}_{(u,v) \in PL(n) \times PL(n)} \left( F^\alpha_{\epsilon,n}(u, v) + \|A_n u - m_n\|^2_{L^2} \right).$

In Theorems 3.3 and 3.4 we prove for the MAP estimates:

(a) For $\alpha = 0$ the minimization problems (1.10) diverge as $n \to \infty$.
(b) For $\alpha > 1$ the MAP estimates $(u^{MAP}_{n,\epsilon}, v^{MAP}_{n,\epsilon})$ converge to the
minimizer, denoted $(u^{MAP}_{\epsilon}, v^{MAP}_{\epsilon})$, of a perturbed Ambrosio-Tortorelli
functional as $n \to \infty$. Moreover, the functions $(u^{MAP}_{\epsilon}, v^{MAP}_{\epsilon})$ are
shown to converge, up to a subsequence, to a minimizer of the Mumford-Shah functional (1.7)
as $\epsilon \to 0$.

In [34] and Remark 5 the following is shown, with slightly different assumptions on the
operator $A$, for the convergence of prior distributions and the CM estimate:

(a') For $\alpha = 0$ the random variables $(U_{n,\epsilon}, V_{n,\epsilon})$ converge in
distribution on $L^2(\mathbb{T}) \times L^2(\mathbb{T})$ and the CM estimates
$(u^{CM}_{n,\epsilon}, v^{CM}_{n,\epsilon})$ converge in $L^2(\mathbb{T}) \times L^2(\mathbb{T})$
as $n \to \infty$.

(b') For $\alpha > 1$ the random variables Ki.e) converge to zero as $n \to \infty$.

The type of convergence in (b') is discussed in Remark 5. Consequently, the results
(a), (b) and (a'), (b') illustrate how the convergence properties of the MAP and CM
estimates are different for hierarchical Gaussian models.

Let us recall that the CM and MAP estimates coincide for finite-dimensional
Gaussian inverse problems [38]. Typically the MAP estimates are computationally
faster to obtain than the CM estimates and thus in inverse problems close to Gaussian
ones the MAP estimate is used as an approximation of the CM estimate. The above
results show that this is not the case for the hierarchical Gaussian models in general.
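
The coincidence of the CM and MAP estimates in the Gaussian case can be illustrated numerically. The following is a minimal sketch for a scalar model $m = au + e$ with Gaussian prior and noise; the parameter values, the grid, and the function names are hypothetical and serve only as an illustration. The CM estimate is computed directly from its definition as an integral and the MAP estimate as the maximizer of the posterior density.

```python
import math

# Hypothetical scalar model m = a*u + e with u ~ N(0, c) and e ~ N(0, g).
A_OP, C_PRIOR, G_NOISE = 2.0, 1.5, 0.3

def posterior_density(u, m):
    """Unnormalized posterior density of u given the measurement m."""
    return math.exp(-0.5 * ((m - A_OP * u) ** 2 / G_NOISE + u ** 2 / C_PRIOR))

def cm_and_map(m, lo=-10.0, hi=10.0, steps=200_001):
    """Return (CM, MAP): CM by a Riemann sum, MAP by a search over the same grid."""
    h = (hi - lo) / (steps - 1)
    grid = [lo + i * h for i in range(steps)]
    weights = [posterior_density(u, m) for u in grid]
    cm = sum(u * w for u, w in zip(grid, weights)) / sum(weights)
    u_map = grid[max(range(steps), key=weights.__getitem__)]
    return cm, u_map

m = 1.0
cm, u_map = cm_and_map(m)
# Closed-form posterior mean of the scalar Gaussian model, for comparison.
closed_form = C_PRIOR * A_OP / (A_OP ** 2 * C_PRIOR + G_NOISE) * m
print(cm, u_map, closed_form)
```

Both numbers agree with the closed-form posterior mean up to the grid resolution. For the hierarchical model the posterior is no longer Gaussian, so no such agreement should be expected.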

Finally, let us consider the current perspectives on Bayesian modelling and how
this paper connects to earlier work. Bayesian inversion in infinite-dimensional function
spaces was first studied by Franklin in [28]. This research has then been continued
and generalized in [571 1131111] • The convergence of the posterior distribution is studied
in [34l HOI [42l [50] . In relation to result (b), the convergence of the posterior distribution
is studied in [351 ISSl HZ] when objective information becomes more accurate with
Gaussian prior and noise distributions. For a general resource on the theory and
computation of Bayesian inverse problems see [19], [38]. For non-Gaussian noise models in
statistical inverse problems see [33]. The Mumford-Shah functional has been applied
to inverse problems for instance in [52], [53], [54], and for related work in image processing
problems see [lOl [TBI HOI [22] . Finally we mention that variational approximation with
Γ-convergence has been used earlier in the context of inverse problems in e.g. [30, 41], [53], [54].

This paper is organized as follows. In Section 2 we introduce the stochastic model
and the necessary tools to tackle the convergence problems related to MAP estimates.
Section 3 covers the main results and the proofs are postponed to Sections 5 and
6. In Section 4 we study the existence of MAP estimates and Section 5 discusses
the cases when the desired convergence does not take place. Section 6 contains the proofs
related to Γ-convergence and equi-coerciveness of the functionals. Finally, in Section 7
we illustrate the method in practice by numerical examples.

2. Definitions. In this section we cover the stochastic model introduced in [31]
and furthermore give the main tools and theoretical results concerning the variational
problem of the MAP estimate. Let us first introduce some notation. Most function
spaces in our presentation have the structure of a real separable Hilbert space. We often
use the $L^2$-based Sobolev spaces $H^s(\mathbb{T})$ for any $s \in \mathbb{R}$ equipped with the
Hilbert space inner product

    $\langle \phi, \psi \rangle_{H^s} = \int_{\mathbb{T}} \left( (I - \Delta)^{s/2} \phi \right)(t) \left( (I - \Delta)^{s/2} \psi \right)(t) \, dt$

for any $\phi, \psi \in H^s(\mathbb{T})$ where $\Delta = \frac{d^2}{dt^2}$. However, we also study
the Banach structure of $H^s(\mathbb{T})$ with dual space $H^{-s}(\mathbb{T})$. In this setting the
Banach dual pairing is denoted by $\langle \cdot, \cdot \rangle_{H^{-s} \times H^s}$. We also denote
by $H^s(\mathbb{T}; [a, b])$, $s > 0$, the functions $f \in H^s(\mathbb{T})$ such that
$a \le f \le b$ a.e. for $a, b \in \mathbb{R}$. Furthermore, we discuss the spaces
$H^s(a, b) = \{ f \mid f = g|_{(a,b)}, \ g \in H^s(\mathbb{R}) \}$ for $a, b, s \in \mathbb{R}$. We say
that a sequence $\{x_j\}_{j=1}^{\infty}$ converges to $x$ strongly in a Banach space $X$ if
$\lim_{j\to\infty} \|x_j - x\|_X = 0$.

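Recall from Hilbert space valued stochastics [IT] that a covariance operator $C_X$
of a Gaussian random variable $X : \Omega \to H$ is defined by the equality

    $\mathbb{E}\left( \langle X - \mathbb{E}X, \phi \rangle_H \langle X - \mathbb{E}X, \psi \rangle_H \right) = \langle C_X \phi, \psi \rangle_H$

for all $\phi, \psi \in H$. We call a Gaussian random variable centered if $\mathbb{E}X = 0$.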

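We use a perturbed derivative $D_q = D + \epsilon^q Q$, where $D = \frac{d}{dt}$ and

    $(Qf)(t) = \left( \int_{\mathbb{T}} f(t') \, dt' \right) \mathbf{1}(t)$

for $\mathbf{1}(t) = 1$ and any $f \in L^2(\mathbb{T})$. This construction guarantees that
$D_q : H^1(\mathbb{T}) \to L^2(\mathbb{T})$ and $D_q|_{PL(n)} : PL(n) \to PC(n)$ are invertible
mappings.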
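
The invertibility of the perturbed derivative on $PL(n)$ can be made concrete. The sketch below is a hypothetical discrete illustration, not the construction used in the paper: a periodic piecewise linear function is stored by its $N$ nodal values, a piecewise constant function by its $N$ cell values, and the function names `apply_Dq` and `invert_Dq` are introduced here only for illustration. It uses the fact that the integral of a periodic piecewise linear function equals the mean of its nodal values.

```python
def apply_Dq(u, eps, q):
    """Cell values of D_q u = Du + eps^q * Qu for periodic piecewise linear u.

    u holds the N nodal values; Qu is the mean of u over the circle.
    """
    N = len(u)
    mean_u = sum(u) / N  # exact integral of a periodic piecewise linear function
    return [N * (u[(j + 1) % N] - u[j]) + eps ** q * mean_u for j in range(N)]

def invert_Dq(f, eps, q):
    """Solve D_q u = f for cell values f by integrating and fixing the mean."""
    N = len(f)
    mean_f = sum(f) / N
    # Du has zero mean on the circle, so eps^q * Qu must equal the mean of f.
    target_mean = mean_f / eps ** q
    u = [0.0]
    for j in range(N - 1):
        u.append(u[-1] + (f[j] - mean_f) / N)  # u_{j+1} = u_j + h * (Du)_j
    shift = target_mean - sum(u) / N
    return [x + shift for x in u]

f = [1.0, -2.0, 0.5, 3.0, 0.0, -1.5, 2.5, 0.25]
u = invert_Dq(f, eps=0.1, q=2)
residual = max(abs(a - b) for a, b in zip(f, apply_Dq(u, eps=0.1, q=2)))
constants = apply_Dq([1.0] * 8, eps=0.1, q=2)  # D alone would map this to zero
print(residual, constants[0])
```

The plain derivative $D$ annihilates constants on the circle and is therefore not injective; the term $\epsilon^q Q$ maps the constant function 1 to $\epsilon^q$ and removes this kernel.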

2.1. Bayes modelling. Let us now briefly describe how we define the Bayesian
maximum a posteriori estimate for the computational model given in equation (1.2).
Let $(H_1, \langle \cdot, \cdot \rangle_1)$ and $(H_2, \langle \cdot, \cdot \rangle_2)$ be two real Hilbert
spaces such that $\dim H_1 = J$ and $\dim H_2 = K$. Assume that $U_n$ obtains realizations in
$H_1$ and the range of the measurement projection $P_k$ is $H_2$. Furthermore, let
$\mathcal{I} : H_1 \to \mathbb{R}^J$ and $\mathcal{J} : H_2 \to \mathbb{R}^K$ be two arbitrary
isometries. Let us now map equation (1.2) to a matrix equation

(2.1)    $\mathbf{M}_{kn} = \mathcal{J} M_{kn} = \mathbf{A}_{kn} \mathbf{U}_n + \mathbf{E}_k,$

where $\mathbf{A}_{kn} = \mathcal{J} A_k \mathcal{I}^{-1} \in \mathbb{R}^{K \times J}$, $A_k = P_k A$,
$\mathbf{E}_k = \mathcal{J} \mathcal{E}_k$ and $\mathbf{U}_n = \mathcal{I} U_n : \Omega \to \mathbb{R}^J$. If
the a priori and likelihood distributions above are absolutely continuous with respect
to the Lebesgue measure, the posterior distribution can be obtained by the Bayes formula:
the posterior density $\pi_{kn}$ then has the form

(2.2)    $\pi_{kn}(\mathbf{u} \mid \mathbf{m}) = \frac{\Pi_n(\mathbf{u}) \, \Gamma_{kn}(\mathbf{m} \mid \mathbf{u})}{\Upsilon_{kn}(\mathbf{m})},$

where $\mathbf{u} \in \mathbb{R}^J$ and $\mathbf{m} \in \mathbb{R}^K$. In equation (2.2) the functions
$\Pi_n$ and $\Gamma_{kn}$ are the prior and likelihood densities, respectively, and
$\Upsilon_{kn}$ is the density of $\mathbf{M}_{kn}$ [38]. The standard definition of the maximum
a posteriori estimate is then

    $\mathbf{u}^{MAP}_{kn} \in \operatorname{argmax}_{\mathbf{u} \in \mathbb{R}^J} \pi_{kn}(\mathbf{u} \mid \mathbf{m}),$

where the set on the right-hand side consists of all points $\mathbf{u}$ maximizing
$\pi_{kn}(\cdot \mid \mathbf{m})$. The value of

    $u^{MAP}_{kn} = \mathcal{I}^{-1}(\mathbf{u}^{MAP}_{kn}) \in H_1$

is commonly defined as the MAP estimate of problem (1.2). Another point estimate
in Bayesian inversion is the CM estimate, which is defined as the integral

    $\mathbf{u}^{CM}_{kn} = \int_{\mathbb{R}^J} \mathbf{u} \, \pi_{kn}(\mathbf{u} \mid \mathbf{m}) \, d\mathbf{u}$ and $u^{CM}_{kn} = \mathcal{I}^{-1}(\mathbf{u}^{CM}_{kn}) \in H_1.$

We note that although the posterior density depends on the inner products
$\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$, both point estimates are invariant with
respect to such choices. For more information about the point estimates see [32] for CM and
[3T] for MAP estimation
in Hilbert spaces.

2.2. The prior model. In this subsection we introduce the prior model discussed
in the introduction and explain the meaning of the density function representation
in equations (1.4) and (1.5). Let us first review the infinite-dimensional prior
model introduced in [34]. Consider a centered Gaussian distribution $\lambda^v$ on
$L^2(\mathbb{T})$ with covariance operator

    $C_U(v) = L \Lambda(v) L^* : L^2(\mathbb{T}) \to L^2(\mathbb{T})$

with $L = D_q^{-1} : L^2(\mathbb{T}) \to L^2(\mathbb{T})$ and the multiplication operator
$\Lambda(v) : L^2(\mathbb{T}) \to L^2(\mathbb{T})$ defined as

    $(\Lambda(v) f)(t) = \frac{1}{\epsilon^2 + v(t)^2} \, f(t), \quad t \in \mathbb{T},$

for any $v \in L^2(\mathbb{T})$. Let us now formally discuss the qualitative behavior of
$\lambda^v$. Such a distribution has the property that in a set of $t \in \mathbb{T}$ where
$v(t)^2$ is large the samples from the distribution $\lambda^v$ are likely to be smooth. Vice
versa, in sets where $v(t)^2$ is small the distribution allows more rapid changes.

Next we set the prior distribution of the random variable $U$ to be $\lambda^v$. However, the
crucial step in hierarchical modelling is to model the values of $v$ with a random variable
$V$. Thus, instead of knowing the exact locations of the jumps, we model how they are
distributed. In [34] the random variable $V$ is Gaussian with expectation $\mathbb{E}V = 1$ and
covariance operator $C_V = \left( \frac{1}{4\epsilon} I - \epsilon \Delta \right)^{-1}$ on
$L^2(\mathbb{T})$. Denote the distribution of $V$ on $L^2(\mathbb{T})$ by $\nu$. The joint
distribution of the random variable $(U, V) : \Omega \to L^2(\mathbb{T}) \times L^2(\mathbb{T})$
is then defined to satisfy

(2.3)    $\lambda(E \times F) = \int_F \lambda^v(E) \, d\nu(v)$

for any Borel measurable sets $E, F \subset L^2(\mathbb{T})$. This construction is shown in [33]
to be well-defined.

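In the following we define the finite-dimensional prior structure studied in this
paper with all scalings $\alpha \in \mathbb{R}$. In [34] these random variables are shown to
converge to $U$ and $V$ in distribution on $L^2(\mathbb{T}) \times L^2(\mathbb{T})$ when
$\alpha = 0$. First define two inner products on $H^1(\mathbb{T})$, namely,

(2.4)    $\langle f, g \rangle_1 := \langle D_q f, D_q g \rangle_{L^2}$ and $\langle f, g \rangle_2 := \langle C_V^{-1/2} f, C_V^{-1/2} g \rangle_{L^2}$

for any $f, g \in H^1(\mathbb{T})$. Next construct two orthonormal bases
$\{f_j\}_{j=1}^{\infty}, \{g_j\}_{j=1}^{\infty} \subset H^1(\mathbb{T})$ with respect to the inner
products $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$, respectively, in the
following way: for any $n \in \mathbb{N}$ we have
$\{f_j\}_{j=1}^{N}, \{g_j\}_{j=1}^{N} \subset PL(n)$ where $N = 2^n$, $n \in \mathbb{Z}_+$. Such
a construction can be obtained, e.g., using the Gram-Schmidt orthonormalization
procedure. To simplify our notations we assume that the probability space
$(\Omega, \Sigma, \mathbb{P})$ has the additional structure $\Omega = \Omega_1 \times \Omega_2$,
$\Sigma = \Sigma_1 \otimes \Sigma_2$ and $\mathbb{P} = \mathbb{P}_1 \otimes \mathbb{P}_2$. We denote
$\omega = (\omega_1, \omega_2) \in \Omega_1 \times \Omega_2$.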

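Definition 2.1. Define $V^\alpha_{n,\epsilon} : \Omega_2 \to PL(n) \subset L^2(\mathbb{T})$ as

    $V^\alpha_{n,\epsilon}(\omega_2) = \sum_{j=1}^{N} V^\alpha_{j,n,\epsilon}(\omega_2) \, g_j + 1,$

where the random vector $\mathbf{V}^\alpha_{n,\epsilon} = (V^\alpha_{j,n,\epsilon})_{j=1}^{N} : \Omega_2 \to \mathbb{R}^N$
is a centered Gaussian random variable with covariance matrix
$C_{\mathbf{V}^\alpha_{n,\epsilon}} = N^{-\alpha} I \in \mathbb{R}^{N \times N}$.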
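Definition 2.2. Let $U^\alpha_{n,\epsilon} : \Omega \to L^2(\mathbb{T})$ be the random variable

    $U^\alpha_{n,\epsilon}(\omega) = \sum_{j=1}^{N} U^\alpha_{j,n,\epsilon}(\omega) \, f_j,$

where the random vector $\mathbf{U}^\alpha_{n,\epsilon}(\omega) = (U^\alpha_{j,n,\epsilon}(\omega))_{j=1}^{N} \in \mathbb{R}^N$
is given the following structure: Denote by $\omega_2 \mapsto C(\omega_2) \in \mathbb{R}^{N \times N}$
a random matrix such that

    $C(\omega_2)_{jk} = N^{-\alpha} \int_{\mathbb{T}} \frac{(D_q f_j)(t) \, (D_q f_k)(t)}{\epsilon^2 + \left( Q_n V^\alpha_{n,\epsilon}(\omega_2) \right)(t)^2} \, dt, \quad 1 \le j, k \le N.$

Due to the positive definiteness of $C$ we can define

    $\mathbf{U}^\alpha_{n,\epsilon}(\omega) = C(\omega_2)^{1/2} \, \mathbf{W}(\omega_1),$

where $\mathbf{W} : \Omega_1 \to \mathbb{R}^N$ is a centered Gaussian random variable with identity
covariance matrix.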

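Following the procedure shown in Section 2.1 choose
$\mathcal{I}_1, \mathcal{I}_2 : PL(n) \to \mathbb{R}^N$ to be two isometries with respect to the
inner products $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$, respectively, and
with the usual inner product of $\mathbb{R}^N$. Clearly, it follows that the vector
presentation is then

    $\mathbf{U}^\alpha_{n,\epsilon} = \mathcal{I}_1 U^\alpha_{n,\epsilon}$ and $\mathbf{V}^\alpha_{n,\epsilon} = \mathcal{I}_2 (V^\alpha_{n,\epsilon} - 1).$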

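In [34] it was shown that if $u, v \in PL(n)$ are arbitrary and
$\mathbf{u} = \mathcal{I}_1 u$, $\mathbf{v} = \mathcal{I}_2 (v - 1) \in \mathbb{R}^N$, then it holds that
the probability density function of $\mathbf{V}^\alpha_{n,\epsilon}$ in $\mathbb{R}^N$ is

(2.5)    $\Pi_{\mathbf{V}^\alpha_{n,\epsilon}}(\mathbf{v}) = c \exp\left( -\frac{N^\alpha}{2} \left( \epsilon \|Dv\|^2_{L^2} + \frac{1}{4\epsilon} \|1 - v\|^2_{L^2} \right) \right)$

and the conditional probability density function of $\mathbf{U}^\alpha_{n,\epsilon}$ in
$\mathbb{R}^N$ is

(2.6)    $\Pi_{\mathbf{U}^\alpha_{n,\epsilon} \mid \mathbf{V}^\alpha_{n,\epsilon}}(\mathbf{u} \mid \mathbf{v}) = c \exp\left( -\frac{N^\alpha}{2} \int_{\mathbb{T}} \left( -N^{1-\alpha} \log(\epsilon^2 + (Q_n v)^2) + (\epsilon^2 + (Q_n v)^2) |D_q u|^2 \right) dt \right).$

With the same assumptions the joint prior density then takes the form

    $\Pi_{(\mathbf{U}^\alpha_{n,\epsilon}, \mathbf{V}^\alpha_{n,\epsilon})}(\mathbf{u}, \mathbf{v}) = \Pi_{\mathbf{U}^\alpha_{n,\epsilon} \mid \mathbf{V}^\alpha_{n,\epsilon}}(\mathbf{u} \mid \mathbf{v}) \cdot \Pi_{\mathbf{V}^\alpha_{n,\epsilon}}(\mathbf{v}) = c \exp\left( -\frac{N^\alpha}{2} F^\alpha_{\epsilon,n}(u, v) \right),$

where the functional $F^\alpha_{\epsilon,n}$ is given in the following definition.

Definition 2.3. For any $\epsilon > 0$, $n \in \mathbb{N}$, and $\alpha \in \mathbb{R}$ let
$F^\alpha_{\epsilon,n} : H^1(\mathbb{T}) \times H^1(\mathbb{T}) \to \mathbb{R} \cup \{\infty\}$ be the
functional such that

    $F^\alpha_{\epsilon,n}(u, v) = \int_{\mathbb{T}} \left( -N^{1-\alpha} \log(\epsilon^2 + (Q_n v)^2) + (\epsilon^2 + (Q_n v)^2) |D_q u|^2 + \epsilon |Dv|^2 + \frac{1}{4\epsilon} (1 - v)^2 \right) dt$

when $(u, v) \in PL(n) \times PL(n)$ and $F^\alpha_{\epsilon,n}(u, v) = \infty$ when
$(u, v) \in (H^1(\mathbb{T}) \times H^1(\mathbb{T})) \setminus (PL(n) \times PL(n))$.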
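
To make Definition 2.3 concrete, the functional can be evaluated exactly for piecewise linear inputs. The sketch below is an illustration under stated assumptions, not code from the paper: $u$ and $v$ are stored by their $N$ nodal values on the periodic mesh, $Q_n v$ is taken to be the cell average of $v$, and the function name `F_alpha_eps_n` and all parameter values are hypothetical.

```python
import math

def F_alpha_eps_n(u, v, eps, q, alpha):
    """Evaluate F^alpha_{eps,n}(u, v) for periodic piecewise linear u, v (nodal values)."""
    N = len(u)
    h = 1.0 / N
    mean_u = sum(u) / N  # Qu: integral of u over the circle
    total = 0.0
    for j in range(N):
        k = (j + 1) % N
        Qv = 0.5 * (v[j] + v[k])                     # cell average of the linear v
        Dqu = (u[k] - u[j]) / h + eps ** q * mean_u  # D_q u is constant on the cell
        Dv = (v[k] - v[j]) / h
        w0, w1 = 1.0 - v[j], 1.0 - v[k]
        int_w2 = h * (w0 * w0 + w0 * w1 + w1 * w1) / 3.0  # exact integral of (1 - v)^2
        weight = eps ** 2 + Qv ** 2
        total += h * (-N ** (1.0 - alpha) * math.log(weight)
                      + weight * Dqu ** 2 + eps * Dv ** 2)
        total += int_w2 / (4.0 * eps)
    return total

eps = 0.05
# Constant pair u = 0, v = 1: only the logarithmic term contributes.
flat = F_alpha_eps_n([0.0] * 8, [1.0] * 8, eps, q=2, alpha=1.0)
# A step signal with two jumps on the circle (between nodes 3, 4 and nodes 7, 0).
step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
v_ones = [1.0] * 8
v_dip = [0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]  # v vanishes on the two jump cells
cost_ones = F_alpha_eps_n(step, v_ones, eps, q=2, alpha=2.0)
cost_dip = F_alpha_eps_n(step, v_dip, eps, q=2, alpha=2.0)
print(flat, cost_ones, cost_dip)
```

In this toy setting the value for the step signal is considerably smaller when $v$ dips to zero on the cells containing the jumps, which is the mechanism by which the auxiliary variable makes sharp jumps of $u$ affordable.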
2.3. Variational approximation and the functions of bounded variation.

In this section we recall the definition and some important properties of Γ-convergence
and the functions of bounded variation. The concept of Γ-convergence was first
introduced by De Giorgi in the 1970s. For a comprehensive presentation on the topic
see [26]. Let $X$ be a separable Banach space endowed with a topology $\tau$ and let
$G, G_j : X \to [-\infty, \infty]$ for all $j \in \mathbb{N}$.

Definition 2.4. We say that $G_j$ Γ-converges to $G$ for the topology $\tau$ and denote
$G = \Gamma\text{-}\lim_{j\to\infty} G_j$ if

(i) For every $x \in X$ and for every sequence $x_j$ $\tau$-converging to $x$ in $X$ we have
$G(x) \le \liminf_{j\to\infty} G_j(x_j)$.

(ii) For every $x \in X$ there exists a sequence $x_j$ $\tau$-converging to $x$ in $X$ such that
$G(x) \ge \limsup_{j\to\infty} G_j(x_j)$.
Note that an equivalent definition is obtained by replacing condition (ii) with

(ii') For every $x \in X$ there exists a sequence $x_j$ $\tau$-converging to $x$ in $X$ such that
$G(x) = \lim_{j\to\infty} G_j(x_j)$.

Definition 2.5. A functional $G : X \to [-\infty, \infty]$ is said to be coercive if the
condition $\lim_{j\to\infty} \|x_j\|_X = \infty$ implies $\lim_{j\to\infty} G(x_j) = \infty$. We call a
sequence of functionals $G_j : X \to [-\infty, \infty]$, $j \in \mathbb{N}$, equicoercive in the
topology $\tau$ if for every $t > 0$ there exists a compact set $K_t \subset X$ such that
$\{x \in X \mid G_j(x) \le t\} \subset K_t$ for all $j \in \mathbb{N}$.

The following theorem summarizes some of the known results regarding Γ-convergence.
For proofs see [55].

Theorem 2.6. Let $G, G_j : X \to [-\infty, \infty]$, $j \in \mathbb{N}$, be a sequence of
equicoercive functionals in the topology $\tau$ and $G = \Gamma\text{-}\lim_{j\to\infty} G_j$. Then
the following properties hold:

(i) If the Γ-limit of $G_j$ exists, it is unique and lower semi-continuous.
(ii) For any continuous $H : X \to \mathbb{R}$ we have $G + H = \Gamma\text{-}\lim_{j\to\infty} (G_j + H)$.
(iii) Let $x_j \in X$ be such that $|G_j(x_j) - \inf_{x \in X} G_j(x)| \le \delta_j$ where
$\delta_j \to 0$. Then any accumulation point $y$ of $\{x_j\}_{j=1}^{\infty} \subset X$ is a minimizer
of $G$ and moreover $\lim_{j\to\infty} G_j(x_j) = G(y)$.
Notice the immediate corollary to (iii): suppose the assumptions in Theorem 2.6
hold and $x_j$ is a minimizer of $G_j$ for $j \in \mathbb{N}$. Then any converging subsequence of
$\{x_j\}_{j=1}^{\infty} \subset X$ converges to a minimizer of $G$.
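
A standard textbook example, not taken from this paper, shows how a Γ-limit can differ from a pointwise limit. The sketch below assumes $X = \mathbb{R}$ with its usual topology and the hypothetical sequence $G_j(x) = \sin(jx)$, which has no pointwise limit for most $x$ but Γ-converges to a constant.

```latex
G_j(x) = \sin(jx), \qquad G(x) \equiv -1 .
% (i) liminf inequality: for every sequence x_j \to x,
\liminf_{j\to\infty} G_j(x_j) \;\ge\; -1 = G(x).
% (ii) recovery sequence: choose x_j with j x_j \in -\tfrac{\pi}{2} + 2\pi\mathbb{Z}
%      closest to x, so that |x_j - x| \le \pi / j \to 0 and
\lim_{j\to\infty} G_j(x_j) = -1 = G(x).
```

Here $\inf G_j = -1 = \min G$ for every $j$, so the minimum values converge. Note, however, that this sequence is not equicoercive on $\mathbb{R}$, so the example only illustrates Definition 2.4 and not the compactness part of Theorem 2.6.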

Let us now turn to the related function spaces. Let $u : \mathbb{T} \to \mathbb{R}$ be a
measurable function and fix $t \in \mathbb{T}$. We say that $z \in \mathbb{R} \cup \{\infty\}$ is
the approximate limit of $u$ at $t$ and write $z = \operatorname{ap\,lim}_{s \to t} u(s)$ if for
every neighbourhood $\mathcal{O}$ of $z$ in $\mathbb{R} \cup \{\infty\}$ it holds that

    $\lim_{\rho \to 0} \frac{1}{\rho} \left| \{ s \in \mathbb{T} \mid |s - t| < \rho, \ u(s) \notin \mathcal{O} \} \right| = 0.$

We use the notation $\tilde{u}(t) = \operatorname{ap\,lim}_{s \to t} u(s)$ when the limit exists.
Denote the set of points $t \in \mathbb{T}$ where the approximate limit does not exist by
$S_u$. When $u \in L^1(\mathbb{T})$ it follows that $|S_u| = 0$, see [5].

Denote by $BV(\mathbb{T})$ the Banach space of functions of bounded variation. A function
$u$ belongs to $BV(\mathbb{T})$ if and only if $u \in L^1(\mathbb{T})$ and its distributional
derivative $Du$ is a bounded signed measure. We endow $BV(\mathbb{T})$ with the usual norm
$\|u\|_{BV} = \|u\|_{L^1} + |Du|(\mathbb{T})$ where $|Du|$ is the total variation of the
distributional derivative.

Recall that due to the Lebesgue decomposition of measures the distributional
derivative $Du$ can be written as a unique sum $Du = D^a u + D^s u$ where $D^a u$ is
absolutely continuous and $D^s u$ is singular with respect to the Lebesgue measure
$|\cdot|$. Denote the density of $D^a u$ with respect to the Lebesgue measure by $\nabla u$.
We call the function $\nabla u$ the approximate gradient of $u$. Moreover, denote
$D^j u = D^s u|_{S_u}$ and $D^c u = D^s u|_{\mathbb{T} \setminus S_u}$ where we have used the
notation $\mu|_X(Y) = \mu(X \cap Y)$ for measurable sets $X, Y \subset \mathbb{T}$. These
restrictions are called the jump part and the Cantor part, respectively. We say
$u \in SBV(\mathbb{T})$, or that it is a special function of bounded variation, if
$u \in BV(\mathbb{T})$ and $D^c u = 0$. Furthermore, denote by $GSBV(\mathbb{T})$ the Borel
functions $u : \mathbb{T} \to \mathbb{R}$ that satisfy $\min(k, \max(u, -k)) \in SBV(\mathbb{T})$
for all $k \in \mathbb{N}$. The space $GSBV$ is called the space of generalized special
functions of bounded variation.

It turns out that the generalized special functions of bounded variation inherit
the most important features of SBV functions. First of all, the set $S_u$ is well-defined
and enumerable for $u \in GSBV(\mathbb{T})$, and the approximate gradient $\nabla u$ exists at
almost every point in $\mathbb{T}$. We refer to [5], [13] for a detailed presentation on these
properties.

2.4. Mumford-Shah and Ambrosio-Tortorelli functionals. The idea of
the weak formulation of the Mumford-Shah functional is to use the function space $GSBV$
as a framework for the minimization problem and identify the set of jumps $K$ in (1.7)
with the set $S_u$ defined above. Let us drop the residual term from the functional (1.7) for
the moment and denote

    $MS(u, v) = \begin{cases} \int_{\mathbb{T}} |\nabla u|^2 \, dt + \#(S_u) & \text{if } u \in GSBV(\mathbb{T}) \text{ and } v = 1 \text{ a.e.}, \\ \infty & \text{otherwise.} \end{cases}$

The role of the auxiliary function $v$ becomes clear later. The regularization term
$MS$ has been widely used in problems related to image segmentation. The
application to ill-posed problems has been less extensive since in general, with a
non-invertible forward operator $A$, the compactness of any minimizing sequence is not
known. For the inverse conductivity problem in [53], [54] the compactness is obtained
by posing an a priori assumption that the minimizers are bounded in $L^\infty$. In Section
4 we prove a compactness result without such an assumption for mildly ill-posed
problems.

Next we define the Ambrosio-Tortorelli functionals [6, 7]. First denote
$X = H^1(\mathbb{T}) \times H^1(\mathbb{T}; [0, 1])$ and the regularizing term

    $AT_\epsilon(u, v) = \int_{\mathbb{T}} \left( (\epsilon^2 + v^2) |Du|^2 + \epsilon |Dv|^2 + \frac{1}{4\epsilon} (1 - v)^2 \right) dt$

for $(u, v) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$. A comprehensive proof for the next theorem can
be found in [13] when $p = 1$ and [21] for the case $p = 2$.

Theorem 2.7. (Ambrosio-Tortorelli) The following statement holds for $p = 1$
and $p = 2$. Define the functional $F_\epsilon : L^p(\mathbb{T}) \times L^p(\mathbb{T}) \to (-\infty, \infty]$
so that

(2.7)    $F_\epsilon(u, v) = \begin{cases} AT_\epsilon(u, v) & \text{when } (u, v) \in X, \\ \infty & \text{otherwise.} \end{cases}$

Then we have that $\Gamma\text{-}\lim_{\epsilon \to 0} F_\epsilon = MS$ in the strong topology of
$L^p(\mathbb{T}) \times L^p(\mathbb{T})$.

3. Main results. Let us now return to the computational model (1.2) and the
prior distributions introduced in Section 2.2. For the results shown in [34] no
dependence of the discretization parameters $k$ and $n$ is assumed. However, in this paper we
need to require that $k$ and $n$ are coupled, i.e., the discretization can be characterized
with only one parameter ($k = k(n)$ and $\lim_{n\to\infty} k(n) = \infty$). For the sake of clarity in
the following we assume $k = n$ and hence we drop the notation $k$. Furthermore, the
computational model (1.2) becomes simply

(3.1)    $M_n = A_n U_n + \mathcal{E}_n.$

Before stating the assumptions concerning the Bayesian inverse problems (3.1) for
$n \in \mathbb{N}$ let us first introduce a definition similar to the one used in [33], [12].

Definition 3.1. The finite-dimensional measurement projections $P_n$, $n \in \mathbb{N}$, are
called proper measurement projections if they satisfy the following conditions:

(i) We have $\mathrm{Ran}(P_n) \subset H^1(\mathbb{T})$ and it holds that
$\|P_n\|_{\mathcal{L}(H^1)} \le C$ and $\|P_n\|_{\mathcal{L}(L^2)} \le C$ for some constant $C$ with
all $n \in \mathbb{N}$.

(ii) For $t \in \{-1, 0, 1\}$ we have $\lim_{n\to\infty} \|P_n f - f\|_{H^t} = 0$ for all
$f \in H^t(\mathbb{T})$.

(iii) For all $\phi, \psi \in L^2(\mathbb{T})$ it holds that
$\langle P_n \phi, \psi \rangle_{L^2} = \langle \phi, P_n \psi \rangle_{L^2}$.

For a discussion about the assumptions regarding the measurement see [42]. In
Section 7 we provide an example of projections $P_n$ that satisfy Definition 3.1.

Assumption 1. For the problems in equation (3.1) there exist proper measurement
projections $P_n$, $n \in \mathbb{N}$, and fixed parameters $\alpha \in \mathbb{R}$, $\epsilon > 0$,
and $s > 0$ such that

(i) there exists a bounded linear operator $A : L^2(\mathbb{T}) \to L^2(\mathbb{T})$ which satisfies

(3.2)    $\|u\|_{H^{-s}} \le C \|Au\|_{L^2}$

for any $u \in L^2(\mathbb{T})$ with some constant $C > 0$ and $A_n = P_n A$ for all
$n \in \mathbb{N}$.
(ii) The additive noise $\mathcal{E}_n$ is a Gaussian random variable in $PL(n)$ such that
$\mathbb{E}\mathcal{E}_n = 0$ and for any $\phi, \psi \in L^2(\mathbb{T})$ the covariance satisfies

    $\mathbb{E}\left( \langle \mathcal{E}_n, \phi \rangle_{L^2} \langle \mathcal{E}_n, \psi \rangle_{L^2} \right) = N^{-\alpha} \langle P_n \phi, P_n \psi \rangle_{L^2}.$

(iii) The prior structure is modelled with the random variables
$(U^\alpha_{n,\epsilon}, V^\alpha_{n,\epsilon})$ introduced in Section 2.2.

(iv) The measurements $m, m_n \in L^2(\mathbb{T})$, $n \in \mathbb{N}$, satisfy
$\lim_{n\to\infty} m_n = m$ in $L^2(\mathbb{T})$.

Notice that condition (ii) means that $\mathcal{E}_n$ has white noise statistics and in the case
$\alpha = 0$ the random variables $\mathcal{E}_n$ converge to white noise in the sense of
generalized random variables as $n \to \infty$ [42], [55]. Now with Assumption 1 the
variational problem of finding the MAP estimates for equation (3.1) becomes

(3.3)    $\min_{(u,v) \in PL(n) \times PL(n)} \left( F^\alpha_{\epsilon,n}(u, v) + \|P_n A u - m_n\|^2_{L^2} \right).$

Below we study the behaviour of the MAP estimates with respect to the parameters
$n$, $\alpha$ and $\epsilon$ using the variational approximation methods presented in Section 2.3.
In order to describe the Γ-limits of the functionals in equation (3.3) we have to introduce
some new notations.

Let us first denote an auxiliary domain

(3.4)    $X_\epsilon = H^1(\mathbb{T}) \times H^1(\mathbb{T}; [0, 1 + 30\epsilon])$

for sufficiently small $\epsilon$. Details about this choice of domain are given in Appendix
A.1 and we discuss it in more detail below. For now, it suffices to point out that the
domain formally approaches $X$ when $\epsilon$ decreases. Denote the auxiliary operators

    $L_\epsilon(v) = -\int_{\mathbb{T}} \log(\epsilon^2 + v^2) \, dt$

and

    $S^q_\epsilon(u, v) = \int_{\mathbb{T}} (\epsilon^2 + v^2) \left( 2 \epsilon^q (Qu) Du + \epsilon^{2q} (Qu)^2 \right) dt$

for $(u, v) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$. We motivate these notations after the next
definition. Recall that $X$ denotes the domain $H^1(\mathbb{T}) \times H^1(\mathbb{T}; [0, 1])$.

Definition 3.2. Let us define functionals
$F^\alpha_\epsilon : L^1(\mathbb{T}) \times L^1(\mathbb{T}) \to (-\infty, \infty]$, $\epsilon > 0$ and
$\alpha \ge 1$, so that for $\alpha = 1$

    $F^1_\epsilon(u, v) = \begin{cases} L_\epsilon(v) + AT_\epsilon(u, v) + S^q_\epsilon(u, v) & \text{when } (u, v) \in X_\epsilon, \\ \infty & \text{otherwise,} \end{cases}$

and for $\alpha > 1$

    $F^\alpha_\epsilon(u, v) = \begin{cases} AT_\epsilon(u, v) + S^q_\epsilon(u, v) & \text{when } (u, v) \in X, \\ \infty & \text{otherwise.} \end{cases}$

Let us now discuss Definition 3.2. The reason for the particular choice of $X_\epsilon$
is two-fold. First, it turns out that the minimizer of the functional $L_\epsilon + AT_\epsilon + S^q_\epsilon$ in
$H^1(\mathbb{T}) \times H^1(\mathbb{T})$ may be located outside $X$. Secondly, a pointwise bound for $v$ provides
easier proofs concerning the $\Gamma$-convergence results of functionals $F^{\alpha}_{\epsilon}$ in Section 6.

Furthermore, it is straightforward to see that

(3.5) $AT_\epsilon(u,v) + S^q_\epsilon(u,v) = \int_{\mathbb{T}} \left( (\epsilon^2 + v^2)|D_q u|^2 + \epsilon|Dv|^2 + \frac{1}{4\epsilon}(1 - v)^2 \right) dt$

everywhere in $H^1(\mathbb{T}) \times H^1(\mathbb{T})$. Hence the role of $S^q_\epsilon$ can be understood as a small
perturbation that yields a lower bound for $|Qu|$ and thus coercivity for $F^{\alpha}_{\epsilon}$. On the
other hand, compared to the Ambrosio-Tortorelli approach, a new term $L_\epsilon$ appears
due to the Bayesian hierarchical modelling.

In addition to problem (3.3), we will consider three different minimization problems
throughout the paper. Two of them are the modified Ambrosio-Tortorelli minimization
problem

(3.6) $\min \; F^{\alpha}_{\epsilon}(u,v) + \|Au - m\|^2_{L^2}$

and the Mumford-Shah problem

(3.7) $\min_{(u,v) \in L^1(\mathbb{T}) \times L^1(\mathbb{T})} MS(u,v) + R(u)$.

In (3.7) we assume that $A : L^p(\mathbb{T}) \to L^p(\mathbb{T})$ is continuous for $p \in \{1,2\}$ and the
residual $R(u)$ is defined by

$R(u) = \begin{cases} \|Au - m\|^2_{L^2} & \text{when } Au \in L^2(\mathbb{T}), \\ \infty & \text{otherwise.} \end{cases}$



In the following we often use the notation $\|Au - m\|^2_{L^2}$ for $R(u)$ when convenient. To
describe cases when the edge-preserving property of MAP estimates is lost asymptotically
we consider the Tikhonov-type minimization problem

(3.8) $H(u) = \begin{cases} \int_{\mathbb{T}} |Du|^2\,dt + \|Au - m\|^2_{L^2} & \text{when } u \in H^1(\mathbb{T}), \\ \infty & \text{otherwise.} \end{cases}$

Notice that the solution to problem (3.8) is obtained by $u_{\min} = (-\Delta + A^*A)^{-1} A^* m$.
With the definitions given above we are ready to state the main results. Denote the
conditional mean estimates introduced in Section 2.1 of problem (3.1) by $(u^{CM}_n, v^{CM}_n)$.
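
The closed form of the Tikhonov-type minimizer can be checked numerically. The following is only a minimal sketch under assumptions that are not part of the statement above: it takes $A = (I - \Delta)^{-s/2}$ (the operator of the model problem in Section 7.1), so that $-\Delta$ and $A$ are both diagonal in the Fourier basis, and it uses a hypothetical box-shaped measurement `m` on a uniform grid. The function names are illustrative.

```python
import numpy as np

def fourier_symbols(N, s):
    """Symbols of -Laplacian and of A = (I - Laplacian)^(-s/2) on N grid points (assumed model)."""
    k = np.fft.fftfreq(N, d=1.0 / N)          # integer frequencies
    lap = (2.0 * np.pi * k) ** 2              # symbol of -Laplacian
    a = (1.0 + lap) ** (-s / 2.0)             # symbol of A (real, hence A* = A)
    return lap, a

def tikhonov_minimizer(m, s):
    """Evaluate u_min = (-Laplacian + A*A)^(-1) A* m mode by mode."""
    lap, a = fourier_symbols(m.size, s)
    u_hat = a * np.fft.fft(m) / (lap + a ** 2)
    return np.real(np.fft.ifft(u_hat))

def tikhonov_energy(u, m, s):
    """Discrete H(u) = int |Du|^2 dt + ||Au - m||^2, computed through Parseval's identity."""
    lap, a = fourier_symbols(u.size, s)
    u_hat = np.fft.fft(u) / u.size            # Fourier coefficients of u
    m_hat = np.fft.fft(m) / m.size
    return float(np.sum(lap * np.abs(u_hat) ** 2 + np.abs(a * u_hat - m_hat) ** 2))

N, s = 128, 0.35
t = np.arange(N) / N
m = np.where((t > 0.3) & (t < 0.6), 1.0, 0.0)  # hypothetical measurement
u_min = tikhonov_minimizer(m, s)
```

Because the discrete energy is a strictly convex quadratic in every Fourier mode, any perturbation of `u_min` should increase `tikhonov_energy`; the sketch makes no claim about how the authors computed the minimizer.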

Theorem 3.3. Let the computational model (3.1) satisfy Assumption 3.1 with
prior parameters $s > 0$, $\alpha = 0$, and $\epsilon > 0$ and let the operator $A : L^2(\mathbb{T}) \to H^1(\mathbb{T})$ be
bounded. Then the following statements hold:

(i) The CM estimates $(u^{CM}_n, v^{CM}_n)$ converge in $L^2(\mathbb{T}) \times L^2(\mathbb{T})$ as $n \to \infty$.

(ii) The MAP estimates $(u^{MAP}_n, v^{MAP}_n)$ diverge as $n \to \infty$.
In addition, the following holds for coupled parameters:

(iii) If $\epsilon = \epsilon(n) \to 0$ as $n \to \infty$ then it follows that either the minimum values in
formula (3.3) diverge to $-\infty$ or the MAP estimates $(u^{MAP}_n, v^{MAP}_n)$ converge
towards a minimizer of the functional (3.8).

The statement (i) of Theorem 3.3 is proved in [34] and statements (ii) and (iii)
are proven in Section 5. Notice that even the coupling of $\epsilon$ and $n$ in statement (iii) does
not yield convergence to Mumford-Shah minimizers. Namely, the diverging minimum
values immediately contradict condition (i) in Definition 2.4 since the functional MS
is positive. Furthermore, the convergence to a minimizer of functional (3.8) implies
that the edge-preserving property of the MAP estimates is lost. We point out that
statement (iii) does not imply that this property is lost with all couplings.

Our main positive result regarding the convergence of the MAP estimates is the 
following. 

Theorem 3.4. Let the computational model (3.1) satisfy Assumption 3.1 with
prior parameters $s < \frac12$, $\alpha \ge 1$, and $\epsilon > 0$ and let the operator $A : L^p(\mathbb{T}) \to L^p(\mathbb{T})$ be
bounded for $p = 1, 2$. Then

(i) The MAP estimates $(u^{MAP}_n, v^{MAP}_n)$ have a subsequence converging to a
minimizer $(u_\epsilon, v_\epsilon)$ of problem (3.6) in the weak topology of $H^1(\mathbb{T}) \times H^1(\mathbb{T})$ as
$n \to \infty$.

(ii) The minimizers $(u_\epsilon, v_\epsilon) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$, $\epsilon > 0$, of the problem (3.6) have
a subsequence converging to a minimizer of the Mumford-Shah problem (3.7)
in $L^1(\mathbb{T}) \times L^1(\mathbb{T})$ as $\epsilon \to 0$.
The result (ii) in Theorem 3.4 can also be considered as a new interpretation of
the Mumford-Shah functional: the minimizer of the Mumford-Shah functional can
be approximated by the MAP estimates of Bayesian inverse problems. The proof for
Theorem 3.4 is given in Section 6.

4. Well-posedness of the minimization problems. In this section we study
the properties of the individual problems (3.3), (3.6) and (3.7) with fixed parameters
$\epsilon$, $\alpha$ and $n$. Our aim is to show three results. First, the existence of a minimizer of
problem (3.7) is proved in Theorem 4.3. Second, we show in Lemma 4.5 that with the
choice of domain $X_\epsilon$ we do not exclude any pairs $(u,v) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$ which give
a smaller value in the problem (3.6). Finally, we show that functionals $F^{\alpha}_{\epsilon}$ and $F^{\alpha}_{\epsilon,n}$
are coercive in $H^1(\mathbb{T}) \times H^1(\mathbb{T})$ which yields the existence of minimizers in problems
(3.3) and (3.6).

Let us now study the existence of a solution to problem (3.7). The following
compactness and semi-continuity theorem in GSBV is well-known.

Lemma 4.1. Suppose a sequence $\{u_j\}_{j=1}^{\infty} \subset GSBV(\mathbb{T})$ satisfies

(4.1) $\|u_j\|_{L^p} + \#(S_{u_j}) + \int_{\mathbb{T}} |\nabla u_j|^2\,dt \le C$

for some $1 < p < 2$. Then there exists $u \in GSBV(\mathbb{T}) \cap L^p(\mathbb{T})$ and a subsequence
$\{u_{j_k}\}_{k=1}^{\infty}$ such that

(i) $u_{j_k} \to u$ strongly in $L^1(\mathbb{T})$,
(ii) $\nabla u_{j_k} \to \nabla u$ weakly in $L^2(\mathbb{T})$ and
(iii) $\#(S_u) \le \liminf_{k\to\infty} \#(S_{u_{j_k}})$.

Proof. If C GSBV{T) satisfies (gH]) then also 



(4.2) 



-'J 1 1 LP 



l^ujl^dt < C. 



By [nulls] (see [25l Thm. 2.1.] for a short exposition) it holds that there exists a 
subsequence $\{u_{j_k}\}_{k=1}^{\infty}$ and $u \in GSBV(\mathbb{T}) \cap L^p(\mathbb{T})$ such that conditions (i) and (iii)
are satisfied. Furthermore, since $\nabla u_j$ is bounded in $L^2(\mathbb{T})$, due to the Banach-Alaoglu
theorem we can extract a subsequence such that condition (ii) holds. □

The next lemma shows that the assumption in Lemma 4.1 is in a sense self-improving
and one can extend it for the purpose of mildly ill-posed problems.

Lemma 4.2. If a function $u \in GSBV(\mathbb{T}) \cap L^1(\mathbb{T})$ satisfies

(4.3) $\|u\|_{H^{-s}} + \#(S_u) + \int_{\mathbb{T}} |\nabla u|^2\,dt \le C'$

for some $0 < s < \frac12$ then for $p > 1$ such that $\frac1p = s + \frac12$ it also satisfies inequality (4.1)
for some constant $C$ depending on $s$ and $C'$.

Proof. Let us denote by $t_j$ the points in $S_u$, such that $S_u = \{t_1, t_2, \ldots, t_L\}$ where
$t_1 < t_2 < \ldots < t_L$ and $L = \#(S_u)$ is bounded. Furthermore, denote by $I_j = (t_{j-1}, t_j) \subset \mathbb{T}$,
$1 \le j \le L$, the interval between neighboring points. Here $t_0$ and $t_L$ were identified.
We can estimate the average of $u$ over interval $I_j$ by

$\left| \frac{1}{|I_j|} \int_{I_j} u\,dt \right| \le C |I_j|^{-\frac12 - s} \|u\|_{H^{-s}(\mathbb{T})}$

where we have used Lemma A.4. Now the Poincaré inequality states that

$\left\| u - \frac{1}{|I_j|} \int_{I_j} u\,dt \right\|_{L^p(I_j)} \le C |I_j| \, \|\nabla u\|_{L^p(I_j)}$

and we obtain $\|u\|_{L^p(I_j)} \le C(|I_j| + 1)$. By using the knowledge $\sum_{j=1}^{L} |I_j| = 1$ we
deduce that $\|u\|_{L^p(\mathbb{T})} \le C''$ where the constant $C''$ depends only on $s$ and $C'$. This
proves the claim. □

Clearly any sequence $u_j$ satisfying inequality (4.1) belongs to $L^\infty(\mathbb{T})$ and thus also
$SBV(\mathbb{T})$. However the bound in (4.1) does not control this norm and hence without
any additional bound in $L^\infty$ the limit does not necessarily belong to $SBV(\mathbb{T})$. As the
existence of a Mumford-Shah minimizer has interest for inverse problems in general
we have formulated an independent proof to the following theorem.

Theorem 4.3. Let $A$ be a bounded linear operator in $L^p(\mathbb{T})$ for $p = 1, 2$ such
that it satisfies inequality (3.2) for some $s < \frac12$. Then the minimization problem

(4.4) $\inf_{u \in L^1(\mathbb{T})} \left( MS(u,1) + R(u) \right)$

has a solution $u \in GSBV(\mathbb{T}) \cap L^1(\mathbb{T})$.

Proof. Assume that $\{u_j\}_{j=1}^{\infty} \subset L^1(\mathbb{T})$ is a minimizing sequence, i.e.,

$\inf \left( MS(u,1) + R(u) \right) = \lim_{j\to\infty} \left( MS(u_j,1) + R(u_j) \right)$.

Then the sequence $u_j$ satisfies inequality (4.3) which in turn yields the conditions in
Lemma 4.1. Consequently, we may extract a subsequence converging in $L^1(\mathbb{T})$ to
$u \in GSBV(\mathbb{T}) \cap L^1(\mathbb{T})$. Notice that the residual term $R(u)$ is lower semicontinuous
in the $L^1(\mathbb{T})$ topology. Denoting the infimum in (4.4) by $I$ we obtain using Lemma
4.1 that

$I \le MS(u,1) + R(u)$
$\le \liminf_{k\to\infty} MS(u_{j_k},1) + \liminf_{k\to\infty} R(u_{j_k})$
$\le \liminf_{k\to\infty} \left( MS(u_{j_k},1) + R(u_{j_k}) \right) \le I$.

Hence $(u,1)$ is a minimizer, which proves the claim. □

Next we discuss the choice of domains $X$ and $X_\epsilon$ in Definition 3.2. Denote by
$\psi_r, \Psi_r : \mathbb{R} \to [0, r]$, $r > 0$, the functions $\psi_r(t) = (r - |r - t|)\chi_{[0,2r]}(t)$ and

(4.5) $\Psi_r(t) = \sum_{j=-\infty}^{\infty} \psi_r(t - 2jr)$.

We notice that for any function $f$ and $r > 0$ the mapping $\Psi_r \circ f$ satisfies
$0 \le (\Psi_r \circ f)(t) \le r$ for all $t \in \mathbb{T}$. Due to this property we call the operation folding. We
list the following three properties of $\Psi_r$ as a lemma.

Lemma 4.4. For any $f \in H^1(\mathbb{T})$ it holds that
(i) $|(\Psi_r \circ f)(t)| \le |f(t)|$ for any $t \in \mathbb{T}$,

(ii) $|r - (\Psi_r \circ f)(t)| \le |r - f(t)|$ and

(iii) $|D(\Psi_r \circ f)| = |Df|$ almost everywhere on $\mathbb{T}$.

Proof. The first claim is obvious since $\Psi_r(f(t)) = |f(t)|$ when $f(t) \in [-r, r]$
and also $0 \le \Psi_r(f(t)) \le r$ for any $t \in \mathbb{T}$. Claim (ii) also follows from the
definition of the function $\Psi_r$. For claim (iii), notice that since $f \in H^1(\mathbb{T})$, by the Sobolev
embedding theorem $f$ must be bounded, i.e., $\sup_{t \in \mathbb{T}} |f(t)| \le C$. In consequence, $\Psi_r \circ f$
can be written as a finite sum over functions $\psi_r(f(\cdot) - 2jr)$. Now the result follows
from a generalization of the chain rule (see e.g. ^j). □ 
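
The three properties of the folding operation are easy to check numerically. The snippet below is a minimal sketch, not part of the original argument: it implements $\Psi_r$ as the $2r$-periodic triangle wave that equals $|t|$ on $[-r, r]$ (which is what the sum in (4.5) produces), and the test function `f` is a hypothetical smooth function on the circle chosen only for illustration.

```python
import numpy as np

def fold(x, r):
    """Folding map Psi_r: the 2r-periodic triangle wave that equals |x| on [-r, r]."""
    x = np.asarray(x, dtype=float)
    return r - np.abs(np.mod(x, 2.0 * r) - r)

t = np.linspace(0.0, 1.0, 2001)
f = 2.5 * np.sin(2.0 * np.pi * t) + 0.7   # hypothetical H^1 function on the circle
r = 1.0
g = fold(f, r)

# (i): |Psi_r(f)| <= |f|, (ii): |r - Psi_r(f)| <= |r - f|
claim_i = bool(np.all(np.abs(g) <= np.abs(f) + 1e-12))
claim_ii = bool(np.all(np.abs(r - g) <= np.abs(r - f) + 1e-12))
# (iii): difference quotients agree in absolute value away from the finitely many kinks
same_slope = np.isclose(np.abs(np.diff(g)), np.abs(np.diff(f)), atol=1e-9)
print(claim_i, claim_ii, float(np.mean(same_slope)))
```

On a grid, property (iii) can only hold on cells that do not contain a kink of $\Psi_r \circ f$, which is why the last check reports a fraction of cells rather than an exact identity.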

In particular, Lemma 4.4 yields that $\Psi_r \circ f \in H^1(\mathbb{T})$ whenever $f \in H^1(\mathbb{T})$. In the
proof of the next lemma we use the idea that in some cases $\Psi_r \circ v$ with suitable choices of
$r > 0$ produces a lower value than $v$ for the considered functionals. Consequently, we
obtain information that the minimizers must lie in $X$ or $X_\epsilon$.

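Lemma 4.5. Let $\alpha \ge 1$. For every $v \in H^1(\mathbb{T})$ there exists $w \in H^1(\mathbb{T}; [0, 1 + 30\epsilon])$
such that

(4.6) $F^{\alpha}_{\epsilon}(u,w) \le \delta_{\alpha,1} L_\epsilon(v) + AT_\epsilon(u,v) + S^q_\epsilon(u,v)$

for all $u \in H^1(\mathbb{T})$ where $\delta_{\alpha,1} = 1$ when $\alpha = 1$ and is otherwise zero.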

Proof. Consider the function $\Psi_1$ defined by equation (4.5). Due to Lemma 4.4 and
equation (3.5) we have $F^{\alpha}_{\epsilon}(u, \Psi_1 \circ v) \le F^{\alpha}_{\epsilon}(u, v)$ for any $(u,v) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$ and
$\alpha > 1$. This immediately yields inequality (4.6) with $w = \Psi_1 \circ v$ for $\alpha > 1$.

Let us then consider the case $\alpha = 1$ and let $(u,v) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$. To apply
folding denote $E_- = \{v(t) < 0\}$, $E_0 = \{0 \le v(t) \le 1 + 30\epsilon\}$ and $E_+ = \{v(t) > 1 + 30\epsilon\}$
and by $1_E$ the indicator function of $E$. We write

$v = v \cdot 1_{E_-} + v \cdot 1_{E_0} + v \cdot 1_{E_+} = v_- + v_0 + v_+$.

We construct $w$ by applying the folding operation to each restriction separately. First,
recall identity (3.5) and denote

(4.7) $G_\epsilon(v, E) = \int_E \left( -\log(\epsilon^2 + v^2) + \frac{1}{4\epsilon}(1 - v)^2 \right) dt$

with any measurable $E \subset \mathbb{T}$. Denote then

$w_- = \Psi_1 \circ v_-$, $w_+ = \Psi_{30\epsilon} \circ (v_+ - 1) + 1$ and $w = w_- + v_0 + w_+$.

Clearly $0 \le w \le 1 + 30\epsilon$ and $(u, w) \in X_\epsilon$.

First, we see due to Lemma 4.4 claim (i) that

$|w(t)| = |w_-(t)| + |v_0(t)| + |w_+(t)| \le |v(t)|$

for all $t \in \mathbb{T}$. Furthermore, claim (iii) in Lemma 4.4 implies $|Dw(t)| = |Dv(t)|$ almost
everywhere on $\mathbb{T}$. These yield

(4.8) $\int_{\mathbb{T}} \left( (\epsilon^2 + w^2)|D_q u|^2 + \epsilon|Dw|^2 \right) dt \le \int_{\mathbb{T}} \left( (\epsilon^2 + v^2)|D_q u|^2 + \epsilon|Dv|^2 \right) dt$.

Let us next consider the integrand $g_\epsilon(t) = -\log(\epsilon^2 + t^2) + \frac{1}{4\epsilon}(1 - t)^2$ in equation
(4.7) and apply the results in Lemma A.1 to functions $w_-$, $v_0$ and $w_+$. Due to claims
(ii) and (iii) in Lemma A.1 it is straightforward to see $G_\epsilon(w_-, E_-) \le G_\epsilon(v_-, E_-)$.
Furthermore, the claim (i) in Lemma A.1 implies $G_\epsilon(w_+, E_+) \le G_\epsilon(v_+, E_+)$. From
this we conclude

$G_\epsilon(w, \mathbb{T}) = G_\epsilon(w_+, E_+) + G_\epsilon(v_0, E_0) + G_\epsilon(w_-, E_-)$

(4.9) $\le G_\epsilon(v_+, E_+) + G_\epsilon(v_0, E_0) + G_\epsilon(v_-, E_-) = G_\epsilon(v, \mathbb{T})$.

Now inequalities (4.8) and (4.9) together with identity (3.5) yield the result. □

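Theorem 4.6. The functionals $F^{\alpha}_{\epsilon}$ for $\alpha \ge 1$ and $F^{\alpha}_{\epsilon,n}$ for $\alpha \in \mathbb{R}$ are coercive in
$H^1(\mathbb{T}) \times H^1(\mathbb{T})$ for any fixed $\alpha$, $n \in \mathbb{N}$ and $\epsilon > 0$.

Proof. Recall that a functional $G : X \to \mathbb{R}$ is coercive if we have a lower bound
$G(x) \ge C\|x\|_X$ for $x \in X$ such that $\|x\|_X$ is large enough. By Lemma A.2 in the
Appendix we know that the functionals are bounded from below. One can deduce
that

$\int_{\mathbb{T}} \epsilon^2 |D_q u|^2\,dt = \int_{\mathbb{T}} \left( \epsilon^2 |Du|^2 + \epsilon^{2+2q} (Qu)^2 \right) dt \ge C(\epsilon) \|u\|^2_{H^1}$.

The lower bound for $\|v\|_{H^1}$ can be obtained from the term $\int_{\mathbb{T}} \left( \epsilon|Dv|^2 + \frac{1}{4\epsilon}(1 - v)^2 \right) dt$.
Hence it follows that both $F^{\alpha}_{\epsilon}(u,v)$ and $F^{\alpha}_{\epsilon,n}(u,v)$ go to infinity when $\|u\|_{H^1}$ or $\|v\|_{H^1}$
goes to infinity. □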

5. Non-edge-preserving scaling. In this section we study the case when $s > 0$
and $\alpha = 0$ and prove Theorem 3.3. Recall that the claim (i) is shown in [34].

Proof of claim (ii) in Theorem 3.3. Consider the value of $F^{0}_{\epsilon,n}$ at the function
$(u(t), v(t)) = (0, s)$ where $s > 1$, namely,

$F^{0}_{\epsilon,n}(u,v) = g_N(s) = -N \log(\epsilon^2 + s^2) + \frac{1}{4\epsilon}(1 - s)^2$

where $N = 2^n$. With fixed $s > 1$ we have then $\lim_{n\to\infty} g_N(s) = -\infty$. Also, it is easy
to see that the minimizing values $s = s(N)$ of $g_N(s)$ go to infinity if $n \to \infty$. Suppose
now that the pair $(u_n, v_n) \in PL(n) \times PL(n)$ is a minimizer for problem (3.3) and
that $(u_n, v_n)$ converges in $H^1(\mathbb{T}) \times H^1(\mathbb{T})$. Clearly then

$F^{0}_{\epsilon,n}(u_n, v_n) + \|A_n u_n - m_n\|^2_{L^2} \le g_N(s(N)) + \|m_n\|^2_{L^2}$

for all $n \in \mathbb{N}$. Since the terms of $F^{0}_{\epsilon,n}$ are all positive except for the logarithm term
and the measurements $m_n$ are bounded in $L^2(\mathbb{T})$ we have

(5.1) $\int_{\mathbb{T}} -N \log(\epsilon^2 + (Q_n v_n)^2)\,dt \le g_N(s(N)) + C$

for some constant $C > 0$. The assumption that $v_n$ converge in $H^1(\mathbb{T})$ yields that
$\|v_n\|_{L^\infty} \le C'$ for all $n \in \mathbb{N}$ with some $C' > 0$. Thus, inequality (5.1) implies $-N C'' \le
g_N(s(N)) + C$ for all $n \in \mathbb{N}$ where $C'' = \log(\epsilon^2 + (C')^2)$. This leads to a contradiction
since $\lim_{n\to\infty} g_N(s(N))/N = -\infty$ and proves the claim. □
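
The divergence used in this proof can be observed directly. The following minimal sketch evaluates $g_N(s) = -N\log(\epsilon^2 + s^2) + \frac{1}{4\epsilon}(1 - s)^2$ on a grid of constants $s \ge 1$ for growing $N = 2^n$; the value of $\epsilon$, the grid and the range of $n$ are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def g(s, N, eps):
    """g_N(s): value of the functional at the constant pair (u, v) = (0, s)."""
    return -N * np.log(eps ** 2 + s ** 2) + (1.0 - s) ** 2 / (4.0 * eps)

eps = 0.1                                    # fixed, illustrative value
s_grid = np.linspace(1.0, 400.0, 400_001)    # candidate constants s >= 1
minima, minimizers = [], []
for n in range(2, 13):
    N = 2 ** n
    values = g(s_grid, N, eps)
    j = int(np.argmin(values))               # grid minimizer s(N)
    minima.append(float(values[j]))
    minimizers.append(float(s_grid[j]))
    print(f"n={n:2d}  N={N:5d}  s(N)~{s_grid[j]:8.3f}  g_N(s(N))~{values[j]:12.2f}")
```

With $\epsilon$ fixed, the printed minimal values decrease without bound and the minimizing constants $s(N)$ grow with $N$, which is the mechanism behind the divergence of the MAP estimates in the case $\alpha = 0$.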

The immediate question after the result (ii) in Theorem 3.3 is whether an appropriate
coupling of $\epsilon$ and $n$ guarantees the convergence of MAP estimates. In the
following we give some negative results about this.

Consider first how the discretization scheme affects the convergence in the $\|\cdot\|_{L^\infty}$-norm.

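Theorem 5.1. Assume $u_n, v_n \in PL(n)$ for $n \in \mathbb{N}$, $N = 2^n$ and $F^{0}_{\epsilon,n}(u_n, v_n) \le C$
for some constant $C > 0$. Then there exists a constant $C' > 0$ such that

$\|1 - v_n\|_{L^\infty} \le C' \sqrt{\epsilon N + \epsilon^2 N^3}$.

Proof. The boundedness of $F^{0}_{\epsilon,n}(u_n, v_n)$ and Lemma A.2 yield

$-C_1 \epsilon N^2 + \int_{\mathbb{T}} \frac{1}{\epsilon}(1 - v_n)^2\,dt \le C_2$

for some constants $C_1, C_2 > 0$. This immediately results in

(5.2) $\|1 - v_n\|^2_{L^2} \le C_2 \epsilon + C_1 \epsilon^2 N^2$.

First denote $t_k = k/N$ for all $0 \le k \le N$. Suppose that $f \in PL(n)$ achieves its
maximum at $t_j$ and denote by $\phi_j \in PL(n)$ a function that satisfies $\phi_j(t_k) = \delta_{jk}$ for
all $0 \le k \le N$. Then by using the simple fact that $\int_{t_{j-1}}^{t_{j+1}} f(t_j)\phi_j(t)\,dt \le \int_{t_{j-1}}^{t_{j+1}} |f(t)|\,dt$
we have

$\|1 - v_n\|_{L^\infty} = \|1 - v_n\|_{L^\infty} N \int_{t_{j-1}}^{t_{j+1}} \phi_j(t)\,dt \le C \sqrt{N} \left( \int_{\mathbb{T}} |1 - v_n(t)|^2\,dt \right)^{1/2}$

where in the last inequality we have used the Cauchy-Schwarz inequality. Now
inequality (5.2) proves the claim. □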

Corollary 5.2. Let $\epsilon = \epsilon(n)$ such that $\lim_{n\to\infty} \epsilon(n) = 0$ and let $(u_n, v_n) \in
PL(n) \times PL(n)$ be a minimizer of $F^{0}_{\epsilon(n),n}$. Then the following statements hold:

(i) If $\lim_{n\to\infty} \sqrt{\epsilon(n)}\,2^n < \infty$ then the function $v_n$ converges uniformly to 1
as $n \to \infty$.
(ii) If $\lim_{n\to\infty} \sqrt{\epsilon(n)}\,2^n = \infty$ then the minimum values of $F^{0}_{\epsilon(n),n}$ diverge, i.e.,

$\lim_{n\to\infty} F^{0}_{\epsilon(n),n}(u_n, v_n) = -\infty$.

Proof. Let us first notice that 
F°„(w„, z;„) < inf F«„(0, v) = inf / f-2" log{e^ + {Q^vf) + ^(1 - dt. 

vePL{n) ' vePL{n) Jj\ 4e / 

The statement (ii) follows from the upper bound given in Lemma A.2.

Assume now that $\lim_{n\to\infty} \sqrt{\epsilon(n)}\,2^n < \infty$. By using the inequality $\log(1 + x) \le x$ it
follows that also $\lim_{\epsilon\to 0} \left( 2^n \log(1 + O(\sqrt{\epsilon})) \right) < \infty$. By simple computations one can
show limj^o \og(i+o{^)) ~ ^ ^'^^ P ^ h '^^'-^ hence the quantity 

/ 1 \ 3 

23"e2 ^ (2"log(l + 0(^/i)))' ' * 



log(l + 0{V~e)) 



converges to zero. The convergence of $2^n \epsilon$ to zero follows by the same argument.
Consequently, the result (i) follows from Theorem 5.1. □

For $n \in \mathbb{N}$ define functionals

(5.3) $H_n(u) = \int_{\mathbb{T}} h_n^2 |D_q u|^2\,dt + \|P_n A u - m_n\|^2_{L^2}$

for $u \in PL(n)$ where $h_n \in L^\infty(\mathbb{T}; \mathbb{R}_+)$ converges to $1(t) = 1$ uniformly.

Theorem 5.3. Let $\epsilon = \epsilon(n)$ such that $\lim_{n\to\infty} \epsilon(n) = 0$. We have that
(a) $H = \Gamma\text{-}\lim_{n\to\infty} H_n$ in the weak topology of $H^1(\mathbb{T})$,

(b) the functionals $\{H_n\}_{n \in \mathbb{N}}$ are equicoercive in the weak topology of $H^1(\mathbb{T})$.
Proof. Let $u_n \to u$ weakly in $H^1(\mathbb{T})$ as $n \to \infty$ where $u_n \in PL(n)$. By lower
semicontinuity of the norm we have

$\int_{\mathbb{T}} |Du|^2\,dt \le \liminf_{n\to\infty} \int_{\mathbb{T}} |Du_n|^2\,dt = \liminf_{n\to\infty} \int_{\mathbb{T}} h_n^2 |D_q u_n|^2\,dt$.

Furthermore, by the Sobolev embedding theorem we see that $u_n \to u$ in $L^2(\mathbb{T})$ and
hence $\lim_{n\to\infty} P_n A u_n - m_n = Au - m$ in $L^2(\mathbb{T})$. Together these imply $H(u) \le
\liminf_{n\to\infty} H_n(u_n)$. This proves (i) in Definition 2.4.

To prove the condition (ii) in Definition 2.4 it is sufficient to consider any sequence
$u_n \in PL(n)$ such that $u_n \to u$ in the $H^1(\mathbb{T})$-norm as $n \to \infty$. This proves the claim
(a) here. Let us then study claim (b) and assume that $u_n \in PL(n)$ for every $n \in \mathbb{N}$
and

(5.4) $H_n(u_n) = \int_{\mathbb{T}} h_n^2 |D_q u_n|^2\,dt + \|P_n A u_n - m_n\|^2_{L^2} \le C$

for some constant $C > 0$. In particular, we have $\|P_n A u_n\|_{L^2} \le C$ for all $n \in \mathbb{N}$.

Next we show that also the sequence $\|Au_n\|_{L^2}$ is uniformly bounded. Assume for
the moment that this is not the case and $\lim_{n\to\infty} \|Au_n\|_{L^2} = \infty$. Recall that the operator
$Q$ was defined as $Qf = \left( \int_{\mathbb{T}} f(t)\,dt \right) 1$ for any $f \in L^1(\mathbb{T})$. Due to the inequality (5.4)
we have that $\|Du_n\|_{L^2} \le C$ and, in consequence, $\|(I - Q)u_n\|_{L^2} \le C$. Moreover, this
yields

(5.5) $\lim_{n\to\infty} \|AQu_n\|_{L^2} \ge \lim_{n\to\infty} \left( \|Au_n\|_{L^2} - \|A(I - Q)u_n\|_{L^2} \right) = \infty$.

By setting $c_n = \int_{\mathbb{T}} u_n(t)\,dt$ equation (5.5) implies that

(5.6) $\lim_{n\to\infty} |c_n| \, \|A1\|_{L^2} = \infty$.

The boundedness of $\|A(I - Q)u_n\|_{L^2}$ yields together with $\|P_n A u_n\|_{L^2} \le C$ that

(5.7) $|c_n| \, \|P_n A 1\|_{L^2} = \|P_n A Q u_n\|_{L^2} \le \|P_n A (I - Q) u_n\|_{L^2} + \|P_n A u_n\|_{L^2} \le C$.

By the condition (5.6) we have $\lim_{n\to\infty} |c_n| = \infty$ and by condition (5.7) it follows
that $\lim_{n\to\infty} \|P_n A 1\|_{L^2} = 0$. Due to the condition (ii) in Definition 3.1 this yields
$A1 = 0$. However, this contradicts equation (5.6). Consequently, we have proven
that $\|Au_n\|_{L^2} \le C$ for some constant $C > 0$.

By the assumption on $A$ with $s > 0$, we have $\|u_n\|_{H^{-s}} \le C \|Au_n\|_{L^2}$. As $1 \in
(H^{-s}(\mathbb{T}))' = H^{s}(\mathbb{T})$, we have $|Qu_n| \le C \|u_n\|_{H^{-s}}$ and hence we obtain by the Poincaré
inequality that $\|u_n\|_{H^1}$ is bounded. By the Banach-Alaoglu theorem there exists a
converging subsequence which completes the proof. □

Finally we conclude this section by completing the proof of Theorem 3.3.

Proof of claim (iii) in Theorem 3.3. Suppose that $\lim_{n\to\infty} \epsilon(n) = 0$ and that
$(u_n, v_n) \in PL(n) \times PL(n)$ minimizes $F^{0}_{\epsilon(n),n}$. By Corollary 5.2 either the minimum
values of $F^{0}_{\epsilon(n),n}$ diverge to $-\infty$ or $v_n \to 1$ uniformly.

Consider the latter case. Then the functions $u_n$ clearly solve minimization problems
$\min_{u \in PL(n)} H_n(u)$ where $H_n(u)$ is defined in equation (5.3) with $h_n = \sqrt{\epsilon^2 + (Q_n v_n)^2}$.
By Theorems 2.6 and 5.3 it follows that $u_n$ converge to a minimizer of $H$. □

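6. Convergence proofs.

6.1. Convergence with respect to n. In this section we prove $\Gamma$-convergence
of $F^{\alpha}_{\epsilon,n}$ with respect to $n$ for all scalings $\alpha \ge 1$. Throughout the section, $0 < s < \frac12$.

Theorem 6.1. For $\alpha \ge 1$ we have $F^{\alpha}_{\epsilon} = \Gamma\text{-}\lim_{n\to\infty} F^{\alpha}_{\epsilon,n}$ in the weak topology of
$H^1(\mathbb{T}) \times H^1(\mathbb{T})$.

Proof. Let us assume that $(u_n, v_n)$ converges weakly to $(u,v) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$.
By the Sobolev embedding theorem $H^1(\mathbb{T})$ embeds compactly to the space of Hölder
continuous functions with exponent less than 1/2 and thus we have $v_n \to v$ strongly
in $C^{0,r}(\mathbb{T})$ for any $r < 1/2$. Furthermore, it follows that

(6.1) $\sup_t |Q_n v_n(t) - v_n(t)| \le \sup_t N \int_{I_{j(t)}} |v_n(t) - v_n(t')|\,dt' \le \|v_n\|_{C^{0,r}} N^{-r}$

where $j(t)$ is such that $t \in I_{j(t)} = \left[ \frac{j}{N}, \frac{j+1}{N} \right)$. Now we see

$\|Q_n v_n - v\|_{L^\infty} \le \|Q_n v_n - v_n\|_{L^\infty} + \|v_n - v\|_{L^\infty} \to 0$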







as n — > oo. The immediate consequence is that 




when $\alpha > 1$. Moreover, it also holds that

Let us now consider the condition (i) in Definition 2.4 of $\Gamma$-convergence. Assume that
$(u_n, v_n) \to (u,v)$ in the weak topology of $H^1(\mathbb{T}) \times H^1(\mathbb{T})$ as $n \to \infty$. Since $Dv_n \to Dv$
weakly in $L^2(\mathbb{T})$ and since a norm is lower semicontinuous we deduce that

(6.2) $\int_{\mathbb{T}} \epsilon |Dv|^2\,dt \le \liminf_{n\to\infty} \int_{\mathbb{T}} \epsilon |Dv_n|^2\,dt$.

Without losing any generality we may also assume $F^{\alpha}_{\epsilon,n}(u_n, v_n) \le C < \infty$ since
otherwise there is nothing to prove. Hence in particular $\int_{\mathbb{T}} |D_q u_n|^2\,dt \le C/\epsilon^2$ and

$\lim_{n\to\infty} \int_{\mathbb{T}} |v^2 - v_n^2| \, |D_q u_n|^2\,dt \le \lim_{n\to\infty} \|v^2 - v_n^2\|_{L^\infty} \cdot \frac{C}{\epsilon^2} = 0$.

Consequently, we obtain

$\int_{\mathbb{T}} (\epsilon^2 + v^2)|D_q u|^2\,dt \le \liminf_{n\to\infty} \int_{\mathbb{T}} (\epsilon^2 + v^2)|D_q u_n|^2\,dt + \lim_{n\to\infty} \int_{\mathbb{T}} (v_n^2 - v^2)|D_q u_n|^2\,dt$

(6.3) $= \liminf_{n\to\infty} \int_{\mathbb{T}} (\epsilon^2 + v_n^2)|D_q u_n|^2\,dt$

due to lower semicontinuity of the norm $\|\cdot\| = \|\sqrt{\epsilon^2 + v^2}\,\cdot\|_{L^2}$ and the weak
convergence of $D_q u_n$. By combining all inequalities above it follows that $F^{\alpha}_{\epsilon}(u,v) \le
\liminf_{n\to\infty} F^{\alpha}_{\epsilon,n}(u_n, v_n)$. This proves (i) in Definition 2.4.

For condition (ii) in Definition 2.4 we note that for an arbitrary $(u,v) \in H^1(\mathbb{T}) \times
H^1(\mathbb{T})$ there exists a sequence $(u_n, v_n) \in PL(n) \times PL(n)$ converging to $(u,v)$ in the
strong topology. It is easy to see that one can then change $\liminf$ into $\lim$ and
inequalities to equalities in (6.2) and (6.3). This yields the claim. □



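6.2. Convergence with respect to $\epsilon$. Let us prove the $\Gamma$-convergence for a
modified functional. Define $E_\epsilon : L^1(\mathbb{T}) \times L^1(\mathbb{T}) \to (-\infty, \infty]$ as

$E_\epsilon(u,v) = \begin{cases} F^{1}_{\epsilon}(u,v) & \text{when } (u,v) \in X, \\ \infty & \text{otherwise.} \end{cases}$

Theorem 6.2. It holds that $MS = \Gamma\text{-}\lim_{\epsilon\to 0} E_\epsilon$ in the strong topology of $L^1(\mathbb{T}) \times
L^1(\mathbb{T})$.

Proof. First we show that condition (i) of Definition 2.4 holds. Suppose that
$\lim_{\epsilon\to 0}(u_\epsilon, v_\epsilon) = (u, v)$ in $L^1(\mathbb{T}) \times L^1(\mathbb{T})$. As in the previous $\Gamma$-convergence proofs we
may assume without losing any generality that

(6.4) $\liminf_{\epsilon\to 0} E_\epsilon(u_\epsilon, v_\epsilon) \le C < \infty$

and $(u_\epsilon, v_\epsilon) \in X$. By using the same technique as in Lemma A.2 we can show a lower
bound

(6.5) $E_\epsilon(u_\epsilon, v_\epsilon) \ge L_\epsilon(v_\epsilon) + \int_{\mathbb{T}} \frac{1}{8\epsilon}(1 - v_\epsilon)^2\,dt \ge -C'\epsilon$

for some constant $C' > 0$. Moreover, inequalities (6.4) and (6.5) yield

(6.6) $\int_{\mathbb{T}} \frac{1}{8\epsilon}(1 - v_\epsilon)^2\,dt \le C + C'\epsilon$

and hence in particular $v_\epsilon \to 1$ in $L^2(\mathbb{T})$ as $\epsilon \to 0$ and $v = 1$ a.e. Since $0 \le v_\epsilon \le 1$ we
have by Lemma A.3 that $\lim_{\epsilon\to 0} L_\epsilon(v_\epsilon) = 0$.

Let us next show that $\lim_{\epsilon\to 0} S^q_\epsilon(u_\epsilon, v_\epsilon) = 0$. Again since $0 \le v_\epsilon \le 1$ we have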



(6.7) \Sliu„v,)\ < Ce%,J \DqU,\dt < Ce%J J \Dgi 



where &e = [/rj, Since u in L^{T) then also — s- | < oo. By 

assumption it holds that \DqU^\^dt < C so that since g > 1 we obtain that the 
right-hand side of inequality (|6.7p converges to zero as e — s- 0. 
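Now Theorem 2.7 implies

$MS(u,v) \le \liminf_{\epsilon\to 0} F^{AT}_\epsilon(u_\epsilon, v_\epsilon) + \lim_{\epsilon\to 0} L_\epsilon(v_\epsilon) + \lim_{\epsilon\to 0} S^q_\epsilon(u_\epsilon, v_\epsilon) = \liminf_{\epsilon\to 0} E_\epsilon(u_\epsilon, v_\epsilon)$.

This yields condition (i).

Next let us consider condition (ii). By Theorem 2.7 for any $(u,v) \in L^1(\mathbb{T}) \times
L^1(\mathbb{T})$ there exists a sequence $\{(u_\epsilon, v_\epsilon)\} \subset X$ such that $\limsup_{\epsilon\to 0} F^{AT}_\epsilon(u_\epsilon, v_\epsilon) \le
MS(u,v)$. By assuming that $MS(u,v)$ is bounded we obtain inequality (6.6) for
$v_\epsilon$ and Lemma A.3 yields $\lim_{\epsilon\to 0} L_\epsilon(v_\epsilon) = 0$. Since also $\int_{\mathbb{T}} \epsilon^2 |Du_\epsilon|^2\,dt \le C$ the
convergence of $S^q_\epsilon(u_\epsilon, v_\epsilon)$ to zero follows from the estimate

$|S^q_\epsilon(u_\epsilon, v_\epsilon)| \le C \left( \epsilon^q b_\epsilon \int_{\mathbb{T}} |Du_\epsilon|\,dt + \epsilon^{2q} b_\epsilon^2 \right)$

where $b_\epsilon = \left| \int_{\mathbb{T}} u_\epsilon\,dt \right|$. Finally we can conclude that

$\limsup_{\epsilon\to 0} E_\epsilon(u_\epsilon, v_\epsilon) = \limsup_{\epsilon\to 0} F^{AT}_\epsilon(u_\epsilon, v_\epsilon) + \lim_{\epsilon\to 0} L_\epsilon(v_\epsilon) + \lim_{\epsilon\to 0} S^q_\epsilon(u_\epsilon, v_\epsilon) \le MS(u,v)$.

This proves (ii) in Definition 2.4 and hence the claim follows. □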

Theorem 6.3. We have that $MS = \Gamma\text{-}\lim_{\epsilon\to 0} F^{\alpha}_{\epsilon}$ in $L^1(\mathbb{T}) \times L^1(\mathbb{T})$ for any
$\alpha \ge 1$.

Proof. We prove only the case when $\alpha = 1$. For $\alpha > 1$ the proof is obtained by
leaving out the considerations regarding the term $L_\epsilon$.

Notice that the functionals $F^{1}_{\epsilon}$ and $E_\epsilon$ differ only in the set $X_\epsilon \setminus X$. Obviously
since $X \subset X_\epsilon$ the condition (ii) in Definition 2.4 follows immediately from the inequality
$F^{1}_{\epsilon} \le E_\epsilon$ and Theorem 6.2.

Let us then consider the condition (i) and let $(u_\epsilon, v_\epsilon) \in X_\epsilon$ be a sequence
converging to $(u,v) \in L^1(\mathbb{T}) \times L^1(\mathbb{T})$ as $\epsilon \to 0$. By assuming that the sequence $F^{1}_{\epsilon}(u_\epsilon, v_\epsilon)$
is bounded we notice as in the proof of Theorem 6.2 that $v = 1$ almost everywhere.
Consider the folding operation $\Psi_1$ defined in equation (4.5). One can easily show that
since $0 \le v_\epsilon \le 1 + 30\epsilon$ we must have $\Psi_1 \circ v_\epsilon \to v$ in $L^1(\mathbb{T})$. Furthermore, due to
Lemma 4.4 we have



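Clearly we have $L_\epsilon(\Psi_1 \circ v_\epsilon) - L_\epsilon(v_\epsilon) \to 0$ as $\epsilon \to 0$ and thus

$MS(u,v) \le \liminf_{\epsilon\to 0} E_\epsilon(u_\epsilon, \Psi_1 \circ v_\epsilon) \le \liminf_{\epsilon\to 0} F^{1}_{\epsilon}(u_\epsilon, v_\epsilon)$

which yields the result. □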

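Lemma 6.4. Let $\alpha \ge 1$ and $u \in SBV(\mathbb{T})$. For any sequence $\{\epsilon_j\}_{j=1}^{\infty}$, $\epsilon_j \to 0$,
there exist functions $\{(u_{\epsilon_j}, v_{\epsilon_j})\}_{j=1}^{\infty} \subset X$ converging to $(u, 1)$ in $L^2(\mathbb{T}) \times L^1(\mathbb{T})$ such that

(6.8) $MS(u, 1) = \lim_{j\to\infty} F^{\alpha}_{\epsilon_j}(u_{\epsilon_j}, v_{\epsilon_j})$.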



Proof. Since u € SBV{T) C i°°(T) we may apply Theorem O for p = 2 to see 
that there are {(we^, Wj^. )}°^]^ C X converging to (w, 1) in L^(T) x L^(T) such that 
MS{u, 1) limj^oo -F'c'^"^(ue , We . ). Following the proof of Theorem 16.21 we can show 
\imj^oo{Lej{vej) + Sj.{u^.^Vi.)) — and the claim follows. □ 

6.3. Convergence of minima. Let us show the equi-coercivity for the sequences
of functionals studied above.

Lemma 6.5. Assume that $v \in H^1(a,b)$, $a < b$, $a, b \in \mathbb{R}$ and $\max_{t \in [a,b]} v(t) -
\min_{t \in [a,b]} v(t) \ge T$. Then it holds that $\int_a^b |Dv(t)|^2\,dt \ge \frac{T^2}{b - a}$.

Proof. By the Sobolev embedding theorem $v$ can be extended to a continuous
function on $[a, b]$. Denote by $t_+$ and $t_-$ points in $[a, b]$ where

$v(t_+) = \max_{t \in [a,b]} v(t)$ and $v(t_-) = \min_{t \in [a,b]} v(t)$.

Without losing generality we may assume that $t_+ > t_-$. Then using the fundamental
theorem of calculus we see that

$T \le v(t_+) - v(t_-) \le \int_{t_-}^{t_+} |Dv(t)|\,dt \le \|Dv\|_{L^2(a,b)} \sqrt{b - a}$.

This proves the statement. □

Let us next prove the equi-coerciveness of the functionals $F^{\alpha}_{\epsilon}$.

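Theorem 6.6. Let $\alpha \ge 1$, $C > 0$, and $(u_\epsilon, v_\epsilon) \in H^1(\mathbb{T}) \times H^1(\mathbb{T})$, $\epsilon > 0$, be a
sequence such that

(6.9) $F^{\alpha}_{\epsilon}(u_\epsilon, v_\epsilon) + \|Au_\epsilon - m\|^2_{L^2} \le C$.

Then there exists a subsequence $\{(u_{\epsilon_j}, v_{\epsilon_j})\}_{j=1}^{\infty}$ which converges in $L^1(\mathbb{T}) \times L^1(\mathbb{T})$.

Proof. The proof is principally the same for both cases $\alpha = 1$ and $\alpha > 1$. First
notice that the assumption (6.9) yields $\|v_\epsilon - 1\|^2_{L^2} \le C\epsilon$ and hence the convergence
of $v_\epsilon$ to 1 in $L^1(\mathbb{T})$ is clear. The case for $u_\epsilon$ follows by considering carefully how the
convergence of $v_\epsilon$ takes place. Let us first fix some $\epsilon > 0$ and divide the domain $\mathbb{T}$
into $K = \lfloor \frac{1}{\epsilon} \rfloor + 1$ half-open intervals $I^K_k = \left[ \frac{k}{K}, \frac{k+1}{K} \right)$ where $\lfloor \frac{1}{\epsilon} \rfloor$ denotes the largest
integer less or equal to $1/\epsilon$. Moreover, denote

$\mathcal{I}_\epsilon = \left\{ 0 \le k < K : \max_{t \in I^K_k} v_\epsilon(t) - \min_{t \in I^K_k} v_\epsilon(t) \ge \frac14 \right\}$.

From Lemma 6.5 and inequality (6.9) we deduce that

$C' \ge \int_{\mathbb{T}} \epsilon |Dv_\epsilon|^2\,dt \ge \#(\mathcal{I}_\epsilon) \frac{\epsilon K}{16}$

and hence $\#(\mathcal{I}_\epsilon) \le 16C'$. Furthermore, denote

$\mathcal{J}_\epsilon = \left\{ k \in \{0, \ldots, K - 1\} \setminus \mathcal{I}_\epsilon \;\middle|\; \min_{t \in I^K_k} v_\epsilon(t) \le \frac14 \right\}$.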

If $k \in \mathcal{J}_\epsilon$ we observe that the minimum of $v_\epsilon$ on the interval $I^K_k$ is less than 1/4 but also
the oscillation is less than 1/4. In consequence, we have that $I^K_k \subset \{t \in \mathbb{T} \mid v_\epsilon(t) < \frac12\}$
and the boundedness in inequality (6.9) yields



tt(Je) • 4 < 4 /(I - < 16eC 



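and thus $\#(\mathcal{J}_\epsilon) \le 16C''$. Consider the union of all intervals in the complement of
$\mathcal{I}_\epsilon \cup \mathcal{J}_\epsilon$. Since the number of indices in $\mathcal{I}_\epsilon \cup \mathcal{J}_\epsilon$ is less than $L = \lfloor 16(C' + C'') \rfloor + 1$
it follows that its complement $(\mathcal{I}_\epsilon \cup \mathcal{J}_\epsilon)^c$ can be presented as a union of at most $L$
half-open connected intervals $K_j$ so that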

Next we obtain L'?-boundedness for Me with some g > 1 by applying the Poincare 
inequality on each interval ij^ , j £ 1^ U J'e, and to every Kj. Let / C T be an open 
connected interval and p > 1 such that - = ^ + s. By the Poincare inequality we have 

(6.10) ||zie-6e|!iP(,) <c|/|||i?^.e|Lp(,) 

where 6e = jTy JjUf{t)dt is the average of Ue on / which by Lemma lA.41 satisfies 
l^el < C|/|~P. Using triangle inequality to estimate the left-hand side of (|6.10p we 
obtain 

(6.11) ||Ue|L.(,) < C\I\ P^.e|Lp(,) + < C{\I\ \\Du,\\^,^j^ + 1). 

Let us now apply the above inequality to 

L 

feei^uj", j=i 
For any interval I^f with A; e Xe U J7e we have by the Holder inequality that 
(6.13) |/f I \\Du,\\^,^jK) < e \\Du,\\^,^jK^ < \\eDu,\\^,^j^^ < Ce^ 

On any interval Kj, I < j < L, we know that We > ^. By the boundedness assumption 
(16.91) we have then 



(6.14) P"e|Lp(u^^^K^) <C 

Applying inequalities (6.11), (6.13) and (6.14) to (6.12) yields

$\|u_\epsilon\|_{L^p(\mathbb{T})} \le C \left( \sum_{k \in \mathcal{I}_\epsilon \cup \mathcal{J}_\epsilon} (\epsilon^{s} + 1) + \sum_{j=1}^{L} (|K_j| + 1) \right) \le C''$

since $L$ and $\#(\mathcal{I}_\epsilon \cup \mathcal{J}_\epsilon)$ were bounded. Notice that this bound is uniform with respect
to $\epsilon$. By the Banach-Alaoglu theorem we can extract a subsequence, denoted also by
$u_\epsilon$, which converges to some $u \in L^p(\mathbb{T})$ weakly.

Next, by inequality (|6.14p we obtain ||'"e|lvi/i,9(j(e)) < C* for some constant C > 
independent of e. Let us then extract a subsequence {u^-}^i in the following way: 
denote = {k/K : k g IjUjTe} C T. Let {ej}j?,i be such that the set Z^^ converges in 
the Hausdorff topology to some discrete set Z such that jl(Z) < L. Note that J{ejY is 
included in an ej-neighbourhood of the set Z^. . Then it follows that for every £ S Z+ 
and (l/£)-neighborhood of Z we have Hwe^ ||^;(^i,,^^<:-) < C. Furthermore, by the 
Banach-Alaoglu theorem we can extract another subsequence, denoted also by Me . , 
which converges weakly in W^''^{Uf) for all i. The Sobolev embedding theorem then 
yields that this subsequence also converges strongly in L^iUI). We conclude that 

^Inn ||M-MeJ^i(T) < (ih " "^Jli(W,) + Ih " "^Jli(W,=)) 



for any (. G Z+ where 1/p' = 1. Finally, the result follows since I was arbitrary. 

□ 

Theorem 6.7. Consider fixed $\epsilon > 0$. Let $(u_n, v_n) \in PL(n) \times PL(n)$ be a sequence
such that

(6.15) $F^{\alpha}_{\epsilon,n}(u_n, v_n) + \|P_n A u_n - m_n\|^2_{L^2} \le C$

for some constant $C < \infty$. Then there exists a subsequence $(u_{n_j}, v_{n_j})$ which converges
weakly in $H^1(\mathbb{T}) \times H^1(\mathbb{T})$.

Proof. We see that $\int_{\mathbb{T}} \epsilon^2 |D_q u_n|^2\,dt$ and thus $\|u_n\|_{H^1}$ are uniformly bounded for
all $n$. Furthermore, boundedness of $\int_{\mathbb{T}} \left( \epsilon |Dv_n|^2 + \frac{1}{4\epsilon}(1 - v_n)^2 \right) dt$ yields a bound for
$\|v_n\|_{H^1}$. Then the claim follows by the Banach-Alaoglu theorem. □

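Proof of Theorem 3.4. The claim (i) follows easily by Theorems 2.6, 6.1 and 6.7.
For the claim (ii) consider the case $\alpha = 1$ and let $(u_\epsilon, v_\epsilon) \in X_\epsilon$ be a minimizer of
problem (3.6). By the equicoercivity Theorem 6.6 we have a subsequence $\{\epsilon_j\}_{j=1}^{\infty}$ such
that $v_{\epsilon_j}$ converges to 1 in $L^1(\mathbb{T})$ and $u_{\epsilon_j}$ to some $u$ in $L^1(\mathbb{T})$. Let us then prove that $u$ is
a minimizer of problem (3.7). Let $\tilde{u} \in L^1(\mathbb{T})$ be such that $MS(\tilde{u}, 1) + R(\tilde{u}) < \infty$. Then
clearly $\tilde{u} \in SBV(\mathbb{T})$ and by Lemma 6.4 there exists a sequence $\{(\tilde{u}_{\epsilon_j}, \tilde{v}_{\epsilon_j})\}_{j=1}^{\infty} \subset X$ such
that $\lim_{j\to\infty} (\tilde{u}_{\epsilon_j}, \tilde{v}_{\epsilon_j}) = (\tilde{u}, 1)$ in $L^2(\mathbb{T}) \times L^1(\mathbb{T})$ and $MS(\tilde{u}, 1) = \lim_{j\to\infty} F^{1}_{\epsilon_j}(\tilde{u}_{\epsilon_j}, \tilde{v}_{\epsilon_j})$.
Since the residual $R$ is continuous in $L^2(\mathbb{T})$ and lower semicontinuous in $L^1(\mathbb{T})$ we
obtain using Theorem 6.3 that

$MS(u, 1) + R(u) \le \liminf_{j\to\infty} F^{1}_{\epsilon_j}(u_{\epsilon_j}, v_{\epsilon_j}) + \liminf_{j\to\infty} R(u_{\epsilon_j})$
$\le \liminf_{j\to\infty} \left( F^{1}_{\epsilon_j}(\tilde{u}_{\epsilon_j}, \tilde{v}_{\epsilon_j}) + R(\tilde{u}_{\epsilon_j}) \right)$

(6.16) $= MS(\tilde{u}, 1) + R(\tilde{u})$.

This proves that $(u, 1)$ is a minimizer of $MS + R$ since $\tilde{u} \in L^1(\mathbb{T})$ was an arbitrary
function for which (6.16) is finite. The case $\alpha > 1$ follows identically. □

Finally, before the numerical results, we present two remarks on issues discussed
in the introduction.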

Remark 1. Let us consider the case when a ^ k in equation Then the 

minimization problem of finding MAP estimates can be written as

$\min \int_{\mathbb{T}} \Big( -N^{1-\alpha} \log(\epsilon^2 + (Q_n v)^2) + (\epsilon^2 + (Q_n v)^2)|D_q u|^2$
$\qquad + \epsilon|Dv|^2 + \frac{1}{4\epsilon}(1 - v)^2 + N^{\kappa-\alpha} |P_n A u - m_n|^2 \Big)\,dt$.

In consequence, the residual term becomes over or under weighted in the limit
regardless of the particular choices of $\alpha$ or $\kappa$.

If $\alpha = \kappa \ge 1$ then our results are summarized in Theorem 3.4 and the MAP
estimates converge.

Remark 2. Let us consider statement (b') in the introduction. When a>\, 



^{{UZ,,<t')h)<e-^N-^\\D-^(^\\l, 



4e 



-1/2 2 



and 



where if) G -L^(T). Applying this for the orthonormal basis {ej}'jL-oo *^ ^^(T), Sj{t) = 
exp(27riji), yields easily with a fixed e > that 

lim E ||v;f, - if = and lim E ||?7",fr2 = 0- 

In particular, this implies that the random variable (f/"e,Kfe) ''^ i^(T) x L^(T) 
converges to (0, 1) in probability as n oo ^3 71. 

7. Numerical considerations. In this section we study the qualitative behaviour
of MAP estimates by giving a numerical example with the scaling $\alpha = 1$.
Our purpose is to demonstrate that the MAP estimates do behave numerically in a
similar manner in all discretizations, i.e., for different choices of parameter $n$. This
can be expected given the results in Theorem 3.4.

The numerical simulations for convergence of the CM estimates are demonstrated
in [31] in the case $\alpha = 0$.

7.1. The model problem. We consider a Bayesian deblurring problem with
linear operator $A = (I - \Delta)^{-s/2} : L^2(\mathbb{T}) \to H^{s}(\mathbb{T})$ for a given $0 < s < 1/2$. Notice that
this operator satisfies the condition $\|u\|_{H^{-s}} \le \|Au\|_{L^2}$ for any $u \in H^{-s}(\mathbb{T})$. We assume
the measurements to be obtained via projections $P_n f = \sum_{j=-N}^{N} \langle f, e_j \rangle_{H^{-s} \times H^{s}} e_j$ for
any $f \in H^{-s}(\mathbb{T})$ where the $L^2$-orthonormal basis functions $\{e_j\}_{j=-N}^{N}$ are $e_j(t) =
\exp(-2\pi i j t)$ for $t \in [0,1)$. It is straightforward to show that projections $P_n$ are
proper measurement projections in the sense of Definition 3.1.

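Let us now introduce some notation. For any $n \in \mathbb{N}$ we denote by $\phi_j^n \in PL(n)$ a
function such that $\phi_j^n(k/N) = \delta_{jk}$. The basis functions $\{\phi_j^n\}_{j=1}^{N} \subset PL(n)$ are called the roof-top
functions. Let then $\mathbf{B}_n \in \mathbb{R}^{N \times N}$ be a matrix such that
$(\mathbf{B}_n)_{jk} = \langle \phi_j^n, \phi_k^n \rangle_{L^2}$ for
$1 \le j, k \le N$.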
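
For a uniform grid the Gram matrix of the roof-top functions has a simple closed form, which the following Python sketch assembles. It assumes that the roof-top functions are the standard periodic hat functions with node spacing $h = 1/N$ on the unit circle, that $N \ge 3$, and that indices are 0-based; the function name is hypothetical.

```python
def rooftop_mass_matrix(n_nodes):
    """Gram matrix B with B[j][k] = <phi_j, phi_k>_{L^2} for periodic hat functions.

    Assumes a uniform grid with spacing h = 1 / n_nodes on the unit circle and
    n_nodes >= 3, so that each hat function overlaps only its two neighbours.
    """
    if n_nodes < 3:
        raise ValueError("need at least 3 nodes")
    h = 1.0 / n_nodes
    b = [[0.0] * n_nodes for _ in range(n_nodes)]
    for j in range(n_nodes):
        b[j][j] = 2.0 * h / 3.0                 # integral of phi_j squared
        b[j][(j + 1) % n_nodes] = h / 6.0       # overlap with right neighbour
        b[j][(j - 1) % n_nodes] = h / 6.0       # overlap with left neighbour (periodic)
    return b
```

Since the hat functions sum to one, every row of the matrix sums to $h$ and the quadratic form evaluated at the constant vector of ones returns the length of the circle, which gives a quick consistency check.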

In the following we use bold symbols to denote the coefficients of any function
$f \in PL(n)$ presented in the roof-top basis, i.e., if $f = \sum_{j=1}^{N} f_j \phi_j^n$ then
$\mathbf{f} = (f_1, \ldots, f_N) \in \mathbb{R}^N$. Furthermore, denote by $\mathbf{D}_n$ and $\mathbf{Q}$ the matrix representations of the
operators $D, Q_n : PL(n) \to PC(n)$, respectively, from the roof-top basis to the basis of
piecewise constant functions $\{\chi_{I_j^n}\}_{j=1}^{N} \subset PC(n)$, where $I_j^n = [j/N, (j+1)/N)$.
Finally, denote the matrix $\Lambda(\mathbf{v}) = \mathrm{diag}(\epsilon^2 + (\mathbf{Q}\mathbf{v})_j^2) \in \mathbb{R}^{N \times N}$. With this notation
the functional $F^{\alpha}_{\epsilon,n}$, written in terms of the coefficients in the roof-top basis,
has the form

Ff_Ju,v) = -7V-"log(detA(v)) + l(D„u)^A(v)(D„u) 
(7.1) +^ ||D„v||2 + ^(1 - vrB„(l - v) + ^ ||A„u - m||2 

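where $\mathbf{A}_n \in \mathbb{R}^{(2N+1) \times N}$ maps $\mathbf{u}$ to the coefficients of $P_n A u$ in the basis $\{e_j\}_{j=-N}^{N}$.

The components of the matrix $\mathbf{A}_n$ satisfy $(\mathbf{A}_n)_{jk} = \langle e_j, A\phi_k^n \rangle_{L^2}$ for $-N \le j \le N$
and $1 \le k \le N$.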
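
Because the matrix $\Lambda(\mathbf{v})$ is diagonal, the two terms of the functional that involve it reduce to plain sums, and the determinant never has to be formed explicitly. The Python sketch below illustrates this. It takes the entries of $\mathbf{Q}\mathbf{v}$ and $\mathbf{D}_n\mathbf{u}$ as ready-made input vectors, since the matrices themselves are not restated here, and the function names are hypothetical.

```python
import math


def log_det_lambda(qv, eps):
    """log det of diag(eps^2 + qv_j^2), computed as a sum of logarithms."""
    return sum(math.log(eps * eps + q * q) for q in qv)


def weighted_quadratic_form(du, qv, eps):
    """(du)^T diag(eps^2 + qv_j^2) (du), i.e. sum_j (eps^2 + qv_j^2) * du_j^2."""
    if len(du) != len(qv):
        raise ValueError("du and qv must have the same length")
    return sum((eps * eps + q * q) * d * d for d, q in zip(du, qv))
```

Summing logarithms also avoids the underflow that a product of many factors close to $\epsilon^2$ would cause for large $N$.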

7.2. Computational methods. Because of the non-quadratic terms in
$F^{\alpha}_{\epsilon,n}$ we have chosen to implement an alternating minimization scheme (see e.g. [14]). The
convergence of such a method is studied in [48] in a setting without the logarithmic
term $L_\epsilon$. Producing a convergence result in our case lies outside the focus of this
section.

Let us now write in pseudo-code how the minimizers are obtained:

(1) Initialize $\mathbf{u}^0, \mathbf{v}^0 \in \mathbb{R}^N$ and set $j := 1$.

(2) Solve the equation (-i:DjA(v.'~^)D„ -I- A^Aj) u = An^m and set u^^ = u. 

(3) Solve the minimization problem 

min f-iV"-Mog(detA(v)) + 4(D„u,)^A(v)(D„Uj) 

v^E-^ y iV 

+ ^l|D"V|l2 + ^(l-vrB„(l-v)) 

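and set $\mathbf{v}^j = \mathbf{v}$.

(4) If $(\mathbf{u}^j, \mathbf{v}^j)$ satisfies $F^{\alpha}_{\epsilon,n}(\mathbf{u}^j, \mathbf{v}^j) < F^{\alpha}_{\epsilon,n}(\mathbf{u}^{j-1}, \mathbf{v}^{j-1}) - \delta$, set $j := j + 1$ and go to step (2); else
stop.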
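
The control flow of steps (1)-(4) can be sketched in Python as below. This is only an illustration of the alternating scheme with the decrease-based stopping rule, not the implementation used for the experiments. The two partial minimizers are passed in as callables, and the toy objective $(u-2)^2 + (v-1)^2 + u^2v^2$ with its closed-form partial minimizers is an assumption chosen for brevity; it merely mimics the coupling between $u$ and $v$.

```python
def alternating_minimization(objective, argmin_u, argmin_v, u0, v0,
                             delta=1e-10, max_iter=50):
    """Alternate exact minimization in u and in v until the decrease is below delta.

    objective(u, v) -> float, argmin_u(v) -> u, argmin_v(u) -> v.
    Returns the last iterate and the list of objective values.
    """
    u, v = u0, v0                        # step (1): initial values
    history = [objective(u, v)]
    for _ in range(max_iter):
        u = argmin_u(v)                  # step (2): minimize over u with v fixed
        v = argmin_v(u)                  # step (3): minimize over v with u fixed
        history.append(objective(u, v))
        # step (4): continue only if the objective dropped by more than delta
        if not history[-1] < history[-2] - delta:
            break
    return u, v, history


def toy_objective(u, v):
    """Biconvex toy functional (hypothetical, for illustration only)."""
    return (u - 2.0) ** 2 + (v - 1.0) ** 2 + (u * v) ** 2


def toy_argmin_u(v):
    """Solves 2(u - 2) + 2 u v^2 = 0 for u."""
    return 2.0 / (1.0 + v * v)


def toy_argmin_v(u):
    """Solves 2(v - 1) + 2 u^2 v = 0 for v."""
    return 1.0 / (1.0 + u * u)
```

Calling `alternating_minimization(toy_objective, toy_argmin_u, toy_argmin_v, 0.0, 1.0)` produces a non-increasing sequence of objective values, because each partial step is an exact minimization.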

7.3. Results. We implemented the problem with operator A having parameter 
s — 0.35 and measurement noise with variance cr = 5 x 10"'^, i.e., replace iV"" 
in equation (|1.8p with <tN~'^. Furthermore, the scaling of the prior is assumed to 
be $\alpha = 1$. We used four different sets of data with two true values of $u$ and two
discretization sizes $N = 512$ and $N = 2048$. The MAP estimates were computed with
sharpness parameters e = 2 x 10^^, 1 x 10^^, 6 x 10^^. The reconstructions are shown 
in Figures 7.1 and 7.2.

In Figure 7.1 the true value of $u$ is a simple step function, and the residual was
weighted with the constant $c = 14$. In Figure 7.2 the true value is piecewise smooth
with $\sharp(S_u) = 4$, and the residual was weighted with $c = 10$. The initial values in all
the computations were the vectors $\mathbf{u}^0 = 0$ and $\mathbf{v}^0 = 1$. Step (2) of the algorithm
was implemented using Matlab's backslash operator, and in step (3) we used
a gradient descent method with step sizes chosen by a line search algorithm. The
minimization in step (3) was stopped when either no acceptable step size was found or
the value of the functional no longer changed to within a high accuracy. All computations were
stopped after 50 iterations.

We performed all the computations with Matlab 7.6 on a desktop PC
with a dual Intel Xeon processor running at 2.80 GHz and 4 GB of RAM.
Computations took less than 10 seconds for $N = 512$ and less than 80 seconds for $N = 2048$.
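
A gradient descent iteration with a line search, of the kind described for step (3), can be sketched in Python as follows. The specific choices here are assumptions rather than the settings used in the experiments: a backtracking (Armijo) rule with halving, an initial trial step of 1, and a relative tolerance on the decrease. The loop stops when no acceptable step size is found, when the functional barely changes, or after a fixed number of iterations.

```python
def gradient_descent(f, grad, x0, max_iter=200, tol=1e-12, shrink=0.5, armijo=1e-4):
    """Minimize f by gradient descent with a backtracking (Armijo) line search.

    f(x) -> float and grad(x) -> list of floats, where x is a list of floats.
    Returns the final point and its objective value.
    """
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:
            break                                    # exact stationary point
        step = 1.0
        while step > 1e-16:
            trial = [xi - step * gi for xi, gi in zip(x, g)]
            f_trial = f(trial)
            if f_trial <= fx - armijo * step * g_norm_sq:
                break                                # sufficient decrease reached
            step *= shrink
        else:
            break                                    # no acceptable step size found
        decrease = fx - f_trial
        x, fx = trial, f_trial
        if decrease <= tol * max(1.0, abs(fx)):
            break                                    # functional no longer changes
    return x, fx
```

For example, with `f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2` and the matching gradient, the iteration started at `[0.0, 0.0]` approaches the minimizer `(1, -2)`.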

7.4. Discussion. A visible feature of Figures 7.1 and 7.2 is that the reconstructions
do not change qualitatively when the discretization parameter $n$ is increased. This
result is in line with Theorem 3.4, i.e., if one fixes $\epsilon > 0$ and takes $n$ to infinity then
the minimizers converge to Ambrosio-Tortorelli minimizers.

It is also evident that the parameter $\epsilon$ controls how sharp a reconstruction one can
obtain. In Figure 7.2 this is visible at the second peak: with the value
$\epsilon = 0.02$ this peak is smoothed out, whereas with the other values the reconstruction
becomes sharp.

The convergence of the algorithm was satisfactory, especially for $u$. In most of
the runs the value of $u$ was attained very accurately in fewer than 10 iteration steps.
However, the function $v$ still evolved slowly after this, and a satisfactory estimate was
obtained with 50 iteration steps, at which point each run was also stopped. The authors expect
that this slowness can be overcome by a more sophisticated minimization algorithm in
step (3) of the algorithm.

Appendix A. Technical lemmata. 

A.1. Properties of the domain $X_\epsilon$. In the definition of the domain in equation
p.4p one restricts the values of function v in the pair {u,v) G H^{T) x H^{T) to the 
interval $[0, 1 + 30\epsilon]$. Let us now discuss some properties related to this choice. Define
function ge{t) = - \og{e'^ + + - t)"^ for t € R where e > is fixed. 

Lemma A.l. Assume that < e < ^. The function has a unique minimizer t^ 
which satisfies $1 < t_\epsilon < 1 + 30\epsilon$. Furthermore, the inequality $g_\epsilon(t) \le g_\epsilon(s)$ holds when
$s$ and $t$ satisfy one of the following conditions:
(i) $1 < t < 1 + 30\epsilon$ and $s > 1 + 30\epsilon$,

(ii) $t \in [0, 1]$ and $s < -1$, or
(iii) $t \in [0, 1]$ and $s = -t$.

Proof. Clearly, one has $g_\epsilon(t) \le g_\epsilon(-t)$ for any $t \in \mathbb{R}_+$. This proves claim (iii), and
since $\lim_{t \to \infty} g_\epsilon(t) = \infty$ this also shows that the global minimizer has to be located
in $\mathbb{R}_+$.

The derivative Dg^ has the form Bg^(t) = + - 1) for < G M. The first 

term is negative everywhere in $\mathbb{R}_+$. Since the second term increases linearly and is
positive for $t > 1$, the zeros of $Dg_\epsilon$ on $\mathbb{R}_+$ have to be greater than 1. Also, since
$\lim_{t \to \infty} Dg_\epsilon(t) = \infty$ and the first term is strictly decreasing for $t > 1$, the function $Dg_\epsilon$
has a unique zero $t_\epsilon$ for $t > 1$. This yields the existence of a unique minimizer for
$g_\epsilon$. Furthermore, claim (ii) can be easily deduced since $Dg_\epsilon(t) < 0$ for $t < 1$ when
$\epsilon < 1/8$.

Let us now show an upper bound to t^. Apply inequality -^j^-pi < j for t > 1 
to obtain a lower bound to function Dg^. By solving equation Dg^it) > —j^ + 
■^{t+ — 1) = for t+ > 1 one obtains a bound t^ < t+. A short computation yields 
$t_+ = 1 + \frac{1}{2}\left(\sqrt{1 + 8\epsilon} - 1\right) < 1 + 2\epsilon$.

Finally, let us study claim (i). From the above it is evident that there exists
a unique point $s_\epsilon > 1$ such that $g_\epsilon(s_\epsilon) = g_\epsilon(1)$. In the following we show that
$s_\epsilon < 1 + 30\epsilon$. Denote $h_\epsilon(t) = g_\epsilon(t) - g_\epsilon(1)$. Then we have that

(A.1) Kit) = - log ^ + ^(1 - 0^ > 1 - ^ + ^(1 - t)^ 
for any $t > 1$, where we have used the inequality $-\log x \ge -x + 1$ for $x > 0$. The
quadratic function on the right-hand side has a zero at $t_1 = 1$, and with a detailed
calculation one can show that the second zero satisfies ^2 < 1 + 30e for e < i. □ 

A.2. Auxiliary bounds. Here we prove auxiliary technical lemmata. Define
G.,„K,6) =^ (^-7Vlog(e2 + (Q„i.„)2) + l(l-^;„)2^ dt 

where $v_n \in PL(n)$, $b, \epsilon > 0$, $a \in \mathbb{R}$ and $n \in \mathbb{N}$.

Lemma A. 2. For any 0<e<^,nEN and b > there are constants C and 
C{b) such that 

-C{b)eN^ < inf G, „(w, b) < -C{^feN - 1). 

vePL(n) 



Proof. The upper bound for the infimum follows by setting v = 1 + y/e and using 
inequality log(l + x) > -^x for small x > 0. For the lower bound first notice that 



-log(e2 + {Q„vf) > -21og + (Q„w)2 > -2 log(e + |g„?;|) > -2{e+\Qnv\ - 1). 
Since \Qnv\dx < \v\dx it holds also that 

(A.2) [ -N log(e2 + {Q^v f)dx > [ ~2N{e + \v\ - l)dx. 

Now denote h^{t) ^ -2N{e + |t| - 1) + ^(1 - i)^ for any t e R. Clearly we have 
he{~t) > h^{t) for t > 0. For positive values of t function is quadratic function 
h,{t) = -2iVe-2A^(t-l)+^(i-l)2 with respect to variable > -1. The minimum 
of this function is obtained when t — 1 = |iVe and thus h^(t) > —2Ne — It is 

now easy to verify that 



/ -2N{e + \v\ - 1) + ^(1 - vfdt > -jN'^e - 2Ne > -C{b)N'^e. 

Together with the inequality (A.2) this yields the claim. □

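Lemma A.3. Assume that a sequence $v_\epsilon \in H^1(\mathbb{T}; [0, 1 + C\epsilon])$ satisfies
$\int_{\mathbb{T}} (1 - v_\epsilon)^2 \, dt \le C'\epsilon$ for some constants $C, C' > 0$. Then it follows that $\lim_{\epsilon \to 0} L_\epsilon(v_\epsilon) = 0$.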

Proof Let us denote E, ^ {t e T \ v,{t) < ^} for e > 0. The Lebesgue 
measure of is bounded by \E^\ < Ce and thus Jg | log(e2 + v-^)\dt < 2Celoge 
which converges to zero as e ^ 0. Denote = ma.x{ve, ^). Clearly also — s- 1 in 
L'^(T) and hence by the Lebesgue dominated convergence theorem liia^^o Lf:{ve) < 
limj^o (Leive) + 2Celoge) ~ 0. This proves the statement. □ 

The following lemma is proved in [57] in more detail. 

Lemma A.4. For any $0 < s < \frac{1}{2}$ and $u \in H^{-s}(a, b)$ with $a, b \in \mathbb{R}$ such that
$b > a$ we have

$\left| \int_a^b u \, dt \right| \le C |b - a|^{\frac{1}{2} - s} \|u\|_{H^{-s}(a,b)}$.



Proof By 44] the dual space of H-^a, b) is iJ^(a, 6) = {/ G H^R) \ supp(/) C 
[a,b]} with norm \\f\\H^(a,b) = II/IIh--(R)- Furthermore, the mapping T : f ^ /l(a,&) 
is continuous in _ff*(M) for any —1/2 < t < 1/2. In particular, we have that the 
indicator function $1_{(a,b)}$ belongs to $H^s(a, b)$. Without loss of generality we assume $a = -b$. The Fourier
transform of l(-6,6) satisfies !(_(,_(,) (^) = (751^^ and thus 



udt 



< 1 



i-b,b)\\H'>{-b,b) ll"llH-n-6,6) 



-ii: 

< C'b^-' \\u\ 



sin 6^ 



1/2 



'(-6,6) 



H-=(-6,6) 



for some constant C" > 0. □ 

Fig. 7.1. A step function. Left: the true value of $u$ and the noisy measurement $m_n = M_n(\omega_0)$.
Middle: u^^^ estimates. Right: v^'^^ estimates. The thick, dashed and thin lines represent 
reconstruction with sharpness e = 2 X 10~^,1 X 10~^,6 X 10~^, respectively. Axis limits are the 
same in each plot. 

Fig. 7.2. A piecewise smooth function. Left: the true value of $u$ and the noisy measurement
m„ = JWn(aJo). Middle: u^'^^ estimates. Right: v^'^^ estimates. The thick, dashed and thin 
lines represent reconstruction with sharpness e = 2 X 10~^,1 X 10^^,6 X 10~^, respectively. Axis 
limits are the same in each plot. 



